A Family of Transcriptional Co-repressors that
Interact with Nuclear Hormone Receptors
and Uses Therefor

RELATED APPLICATIONS

This application is a continuation-in-part application of pending United
States application Serial No. 08/522,726, filed September 1, 1995, and is related to
United States application Serial No. , filed on even date herewith, each
of which is incorporated herein in its entirety by reference.

FIELD OF THE INVENTION 

The present invention relates to intracellular receptors, methods for the
modulation thereof, and methods for the identification of novel ligands therefor. In a
particular aspect, the present invention relates to methods for the identification of
compounds which function as ligands (or ligand precursors) for intracellular receptors.
In another aspect, the present invention relates to novel chimeric constructs and uses
therefor.

BACKGROUND OF THE INVENTION

A central problem in eukaryotic molecular biology continues to be the
elucidation of molecules and mechanisms that mediate specific gene regulation. As part
of the scientific attack on this problem, a great deal of work has been done in efforts to
identify ligands (i.e., exogenous inducers) which are capable of mediating specific gene
regulation. Additional work has been done in efforts to identify other molecules
involved in specific gene regulation.

Although much remains to be learned about the specifics of gene
regulation, it is known that ligands modulate gene transcription by acting in concert with
intracellular components, including intracellular receptors and discrete DNA sequences
known as hormone response elements (HREs).



The identification of compounds that directly or indirectly interact with 
intracellular receptors, and thereby affect transcription of hormone-responsive genes, 
would be of significant value, e.g., for therapeutic applications.

Transcriptional silencing mediated by nuclear receptors plays an
important role in development and cell differentiation, and is directly linked to the
oncogenic activity of v-erbA. The mechanism underlying this effect is unknown but is
one key to understanding the molecular basis of hormone action. Accordingly, the
identification of components involved in transcriptional silencing would represent a
great advance in current understanding of mechanisms that mediate specific gene
regulation.



Other information helpful in the understanding and practice of the
present invention can be found in commonly assigned United States Patent Nos.
5,071,773, 4,981,784, 5,260,432, and 5,091,513, all of which are hereby incorporated
herein by reference in their entirety.



BRIEF DESCRIPTION OF THE INVENTION 

The present invention overcomes many problems in the art by providing
a family of receptor interacting co-repressors, referred to herein as "SMRT
co-repressor", i.e., a silencing mediator (co-repressor) for retinoic acid receptor (RAR) and
thyroid hormone receptor (TR). In vivo, members of the SMRT family of co-repressors
function as potent co-repressors. A GAL4 DNA binding domain (DBD) fusion with a
SMRT co-repressor behaves as a frank repressor of a GAL4-dependent reporter.
Together, these observations identify a novel family of cofactors that is believed to
represent an important mediator of hormone action.

Accordingly, the present invention provides isolated silencing
mediators of retinoic acid and thyroid hormone receptors, and isoforms or peptide
portions thereof (SMRT co-repressors), that modulate transcriptional potential of
members of the nuclear receptor superfamily. Such SMRT co-repressors comprise a
repression domain having less than about 83% identity with a Sin3A interaction
domain of N-CoR (amino acids 255 to 312 of SEQ ID NO: 11); less than about 57%
identity with repression domain 1 of N-CoR (amino acids 1 to 312 of SEQ ID
NO: 11); less than about 66% identity with a SANT domain of N-CoR (amino acids
312 to 668 of SEQ ID NO: 11); and/or less than about 30% identity with repression
domain 2 of N-CoR (amino acids 736 to 1031 of SEQ ID NO: 11).

In accordance with yet another embodiment of the present invention,
there are provided isolated peptides comprising a portion of the invention
SMRT co-repressor that includes at least six contiguous amino acids of an amino acid
sequence selected from the group consisting of:

amino acids 1 to 1030 of SEQ ID NO: 5;
amino acids 1 to 1029 of SEQ ID NO: 7;
amino acids 1 to 809 of SEQ ID NO: 9;
and conservative variations thereof,
provided the peptide is not identical to a sequence of SEQ ID NO: 11.

In addition, there are provided isolated antibodies that bind specifically
to invention isolated peptides. There are also provided chimeric molecules
comprising invention isolated peptides and at least a second molecule. Also provided
are complexes comprising an invention SMRT co-repressor and a member of the
superfamily of nuclear receptors, and isolated antibodies that bind to such complexes.

Accordingly, the present invention provides isolated polynucleotides
encoding members of the newly described family of silencing mediators of retinoic
acid and thyroid hormone receptor or an isoform or peptide portion thereof (SMRT
co-repressor), or an isolated polynucleotide complementary thereto. In addition, there
are provided vectors comprising invention polynucleotides, as well as host cells
containing invention polynucleotides.

In additional embodiments of the present invention, there are provided
methods for identifying agents that modulate the repressor potential of a SMRT
co-repressor.

In another embodiment according to the present invention, there are
provided methods for identifying an agent that modulates a function of an invention
SMRT co-repressor.

In another embodiment according to the present invention, there are
provided methods of modulating the transcriptional potential of a member of the
nuclear receptor superfamily (nuclear receptor) in a cell.

In another embodiment according to the present invention, there are
provided methods of identifying a molecule that interacts specifically with a SMRT
co-repressor.

BRIEF DESCRIPTION OF THE FIGURES 


Figure 1 shows the quantitation by phosphoimager of a dose-dependent 
dissociation of SMRT from RAR or TR by all-trans retinoic acid (atRA) or thyroid 
hormone (triiodothyronine or T3). 

Figure 2 presents amino acid (aa) sequences of SMRT (Genbank
accession number XXXXX). The aa sequence presented in parentheses (i.e., residues
1330-1376) is an alternatively spliced insert which is not present in the original
two-hybrid clone (C-SMRT, aa 981 to C-terminal end). The proline-rich N-terminal
domain (aa 1-160) and the glutamine-rich region (aa 1061-1132), as well as the ERDR
and SG regions, are also indicated. The C-terminal region of SMRT (aa 1201 to
C-terminal end) shows 48% aa identity to RIP13 (Seol et al., Molecular Endocrinology
9:72-85 (1995)). The rest of the sequence of RIP13 shows 22% aa identity to SMRT (aa
819-1200).

Figure 3 illustrates mediation of the silencing effect of hRARα and hTRβ
by SMRT in vivo.

Figure 3(A) illustrates that v-erbA reverses the silencing effect of
GAL-RAR (GAL4 DBD-hRARα 156-462) while SMRT restores the silencing effect.

Figure 3(B) illustrates that the RAR403 truncation mutant reverses the
silencing effect of GAL-TR (GAL4 DBD-hTRβ 173-456) while SMRT restores the
silencing effect.

Figure 3(C) illustrates that v-erbA and full length SMRT or C-SMRT
have no effect on GAL-VP16 activity.

Figure 3(D) illustrates that a GAL4 DBD fusion of full length SMRT
represses the activity of a thymidine kinase basal promoter containing four GAL4 binding
sites. The fold repression was calculated by dividing the normalized luciferase
activity of cells transfected with the GAL4 DBD alone by that of cells transfected with
the indicated amount of GAL DBD fusion constructs.
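
The fold-repression calculation described for Figure 3(D) can be sketched as follows. This is a minimal illustration only, assuming the luciferase readings have already been normalized; the function name and the numerical values are hypothetical and are not data from the experiments.

```python
def fold_repression(dbd_alone_activity, fusion_activity):
    """Fold repression = normalized activity with GAL4 DBD alone
    divided by normalized activity with the GAL4 DBD fusion construct."""
    if fusion_activity <= 0:
        raise ValueError("fusion activity must be positive")
    return dbd_alone_activity / fusion_activity


# Hypothetical normalized luciferase values (illustrative only).
control = 1200.0
fusion = 150.0
print(fold_repression(control, fusion))
```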

Figure 4 provides an alignment of the human SMRT (SEQ ID NO: 5)
and mouse SMRTα (SEQ ID NO: 7) amino acid sequences. Proteins were aligned
using the CLUSTAL alignment program. Underlined sequence of mouse SMRTα
corresponds to the amino acid sequences that are deleted in mouse SMRTβ. The
arrow indicates the start point of the previously described human SMRT co-repressor
(sSMRT).

Figures 5A and 5B provide alignments of the human SMRT and
human N-CoR co-repressors.

Figure 6A is a graph showing the results of transactivation experiments
using transcripts encoding a detectable reporter and either wild type EcR (EcR wt), a
repression-defective EcR allele (EcR A483T) or vp16 activation domain
fused to Ultraspiracle (vp16-USP).

Figure 6B is a graph showing the results of transactivation experiments
using CMV promoter-driven expression vectors. Wild-type EcR or EcR A483T was
cotransfected with vp16-USP and Gal4-c-SMRT (aa 981 to C terminus) (Chen and
Evans, Nature 377:454-457, (1995)) into CV-1 cells to examine its effect on the
interaction with vertebrate corepressor. All cells were also cotransfected with a
TK-luciferase reporter construct, pMH100-TK-Luc, containing four copies of the yeast
Gal4-responsive element.

Figure 6C shows alignment of EcR, rTR, hRAR, and rRev-erbA
receptor sequences and the secondary structure in the LBD signature motif region.
Conserved residues are marked in dark. The mutation 483 (A→T) is marked at the top
of the corresponding residue.

Figure 7 is a graph showing β-galactosidase activity in a yeast two-hybrid
screen with pAS-EcR as bait. pAS-EcR is a fusion gene with the region
corresponding to aa 223-878 of EcRB1 fused C-terminally to the Gal4-DBD of the
pAS1-CYH2 construct (Durfee et al., Genes Dev 7:555-569 (1993)); other
Gal4-DBD-based nuclear receptor constructs used in this yeast two-hybrid assay include: USP
(aa 50-508), hRAR (aa 186-462) and hTR (aa 121-410) (Schulman et al., Proc. Natl.
Acad. Sci. USA, 92:8288-8292, (1995)), and SMRT (Chen and Evans, (1995), supra).
β-galactosidase activities were quantified by liquid assay for yeast cells treated either
without ligand or with 3 nM of corresponding hormone. All-trans retinoic acid
(ATRA) is a ligand of RAR; 3,3',5-triiodothyroacetic acid (T3) is a ligand of TR.
RAR, retinoic acid receptor; TR, thyroid hormone receptor.

Figure 8A shows the complete amino acid sequence of the SMRTER
protein (SEQ ID NO: 12). The underlined regions represent the residues also
conserved in SMRT and N-CoR. The gray box indicates the sequences of the E52
clone.

Figure 8B is a schematic structural diagram of SMRTER, SMRT, and
N-CoR showing the conserved SNOR, SANT, GST, ITS, D/ER repeat, and LSD
motifs with their designated patterns positioned in their relative regions in each
protein.

Figure 9. Sequence Comparison of SMRTER, SMRT, N-CoR, and
Other Related Proteins. The SANT domains of various proteins are listed. Percent
identities/similarities compared to SMRTER are shown on the right. Two potential
helices are predicted in the N-terminal half of the SANT domain. Black boxes
indicate identical sequences; gray boxes, similar or partially identical sequences.

Figure 10 is a schematic representation showing functional domains in
SMRTER. Numbers on the left represent the regions in SMRTER used to generate
the Gal4-DBD fusion genes. Black stippled bars indicate the locations of
EcR-interacting domains; gray stippled bars indicate repression domains. Plus signs
indicate that a positive interaction between SMRTER and the EcR complex and
repression of basal activity by Gal4-SMRTER is significant. ERID = ecdysone
receptor-interacting domain; SMRD = SMRTER repressor domain.

Figure 11A is a graph showing the interaction of ERID1 and ERID2
with the EcR complex. Figure 11B is a graph showing the results of competition
between ERID1, ERID2 and c-SMRT for binding to EcR. Figure 11C is a graph
showing that EcR A483T disrupts the interaction with ERID1 and ERID2.

Figure 12A shows the results of mapping three repression domains. To
examine repressive activity, transcriptional activity of each Gal4-SMRTER fusion
was compared to the basal activity of Gal4-DBD on the reporter. Only repression of
approximately 5-fold or greater is considered positive (+).
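
The scoring rule stated for Figure 12A can be illustrated with a short sketch. It assumes that "approximately 5-fold" is taken as a cutoff of exactly 5, which is a simplification; the construct names and activity values below are hypothetical.

```python
POSITIVE_CUTOFF = 5.0  # assumption: "approximately 5-fold or over" treated as >= 5


def score_repression(basal_activity, fusion_activities, cutoff=POSITIVE_CUTOFF):
    """Label each Gal4 fusion '+' or '-' by its fold repression
    relative to the basal activity of Gal4-DBD alone."""
    scores = {}
    for name, activity in fusion_activities.items():
        fold = basal_activity / activity
        scores[name] = "+" if fold >= cutoff else "-"
    return scores


# Hypothetical reporter activities (illustrative only).
example = {"fragment_A": 10.0, "fragment_B": 40.0}
print(score_repression(100.0, example))
```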

Figure 12B is a schematic representation of mapping the
SMRTER-interacting domain in mSin3A and dSin3A. Yeast two-hybrid assays were used to
assess the interaction between each Gal4-DBD-based fusion gene of each SMRD and
the ACT-based fusion genes of mSin3A and dSin3A. The numbers indicate the region
in either mSin3A or in dSin3A used to generate the ACT fusion genes. Constructs of
mSin3A were described previously in Nagy et al., Cell 89:373-380, (1997).

Figure 12C shows an alignment of SMRD3 of SMRTER and an
mSin3-interacting domain of N-CoR. Conserved residues are boxed in gray. An
asterisk indicates the region where the mutation (Gly) was generated. Minus signs
indicate that the interaction between SMRD3 and Sin3A was not detectable in the
yeast two-hybrid assays. Repression was measured by comparing the transcriptional
activity of Gal4-SMRD3 M2 or Gal4-SMRD3 M3 to that of wild-type Gal4-SMRD3
using transfection experiments as described above.

DETAILED DESCRIPTION OF THE INVENTION

In accordance with the present invention, there is provided a family of
isolated SMRT co-repressors, and isoforms and peptide portions thereof, that modulate
transcriptional potential of members of the nuclear receptor superfamily. Exemplary
members of this family are co-repressors having substantially the same sequence as
residues 1-1329 plus 1376-1495, as set forth in SEQ ID NO: 1, optionally further
comprising the amino acid residues set forth in SEQ ID NO: 2 (i.e., residues 1330-1375
of SEQ ID NO: 1).

In another embodiment according to the present invention, the
invention SMRT co-repressor comprises a repression domain having less than about
83% identity with a Sin3A interaction domain of N-CoR (amino acids 255 to 312
of SEQ ID NO: 11); less than about 57% identity with repression domain 1 of N-CoR
(amino acids 1 to 312 of SEQ ID NO: 11); less than about 66% identity with a SANT
domain of N-CoR (amino acids 312 to 668 of SEQ ID NO: 11); and/or less than about
30% identity with repression domain 2 of N-CoR (amino acids 736 to 1031 of SEQ
ID NO: 11). Such an encoded SMRT co-repressor or peptide portion thereof is
further characterized in that it can modulate transcriptional potential of a member of
the nuclear receptor superfamily (nuclear receptor).
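
As an illustration of how a percent-identity limit of this kind might be checked, the following minimal sketch computes identity over two sequences that have already been aligned (for example with an alignment program). The specification does not state the exact identity formula, so the definition used here (identical columns divided by all columns that are not gap-gap) is an assumption, and the short sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over two pre-aligned sequences of equal length.
    Columns where both sequences carry a gap are ignored (assumed convention)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("no aligned columns")
    return 100.0 * matches / columns


def below_threshold(aligned_a, aligned_b, threshold):
    """True if the pair shares less than `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) < threshold


# Hypothetical toy alignment (not taken from any SEQ ID NO).
print(round(percent_identity("MKT-LV", "MRTALV"), 1))
print(below_threshold("MKT-LV", "MRTALV", 83))
```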

The invention SMRT co-repressors are additionally exemplified by a
full length human SMRT co-repressor (amino acids 1 to 2517 of SEQ ID NO: 5); and
by two mouse SMRT isoforms, including a longer SMRT isoform designated mouse
SMRTα, which has an amino acid sequence set forth as amino acids 1 to 2473 of SEQ
ID NO: 7; and a shorter SMRT isoform designated mouse SMRTβ (amino acids 1 to
2253 of SEQ ID NO: 9). As compared to the mouse SMRTα isoform (SEQ ID NO:
7), the mouse SMRTβ isoform (SEQ ID NO: 9) has a deletion corresponding to
amino acids 36 to 254 of SEQ ID NO: 7.

A peptide portion of a SMRT co-repressor is exemplified herein by
amino acids 1 to 1031 of SEQ ID NO: 5; amino acids 1 to 1031 of SEQ ID NO: 7;
and amino acids 1 to 813 of SEQ ID NO: 9, which includes the entire amino terminal
domain of a SMRT co-repressor. Additional peptide portions of a SMRT
co-repressor are exemplified by amino acids 1 to 303 of SEQ ID NO: 7; amino acids 845
to 986 of SEQ ID NO: 7; amino acids 427 to 663 of SEQ ID NO: 7; amino acids 845
to 1055 of SEQ ID NO: 7; amino acids 736 to 1031 of SEQ ID NO: 7; and amino
acids 1 to 85 of SEQ ID NO: 9, which are sub-domains of the amino terminal domain
of mouse SMRTα that have nuclear receptor repressor potential, as well as by the
corresponding peptide portions of human SMRT and corresponding peptide portions
of mouse SMRTβ, which can modulate the transcriptional potential of a nuclear
receptor, particularly a nuclear receptor that is in the form of a dimer, for example, a
thyroid hormone receptor homodimer, a retinoic acid receptor homodimer, a retinoid
X receptor homodimer, a thyroid hormone receptor-retinoid X receptor heterodimer,
or a retinoic acid receptor-retinoid X receptor heterodimer. In addition, the invention
relates to isolated peptides that contain at least six contiguous amino acids of an
amino acid sequence set forth as amino acids 1 to 1030 of SEQ ID NO: 5; amino
acids 1 to 1029 of SEQ ID NO: 7; or amino acids 1 to 809 of SEQ ID NO: 9, provided
the SMRT peptide is not identical to a sequence of N-CoR (SEQ ID NO: 11).
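
The "at least six contiguous amino acids" criterion, together with the N-CoR proviso, can be sketched as a simple window comparison. This is an illustrative reading only: "not identical to a sequence of N-CoR" is interpreted here as "the peptide does not occur verbatim within the N-CoR sequence", and the short strings are hypothetical stand-ins rather than portions of any SEQ ID NO.

```python
def shared_windows(peptide, reference, k=6):
    """Return the k-residue windows of `peptide` that also occur in `reference`."""
    ref_windows = {reference[i:i + k] for i in range(len(reference) - k + 1)}
    pep_windows = {peptide[i:i + k] for i in range(len(peptide) - k + 1)}
    return pep_windows & ref_windows


def qualifies(peptide, smrt_region, ncor, k=6):
    """True if the peptide shares at least k contiguous residues with the SMRT
    region and is not found verbatim in the N-CoR sequence (assumed reading)."""
    return bool(shared_windows(peptide, smrt_region, k)) and peptide not in ncor


# Hypothetical toy sequences (illustrative only).
smrt_toy = "MSGSTQPVAQTWRA"
ncor_toy = "MSSSGYPPNQGAFS"
print(qualifies("TQPVAQ", smrt_toy, ncor_toy))
print(qualifies("SSGYPP", smrt_toy, ncor_toy))
```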

Invention co-repressor can be an invertebrate SMRT co-repressor, such
as the Drosophila SMRTER co-repressor having an amino acid sequence as set forth
in SEQ ID NO: 12, or conservative variations thereof.

Additional exemplary co-repressors are those containing one or both of
the receptor interacting domains (ERID1 and ERID2) identified in the Drosophila
co-repressor. For example, co-repressors containing such receptor interacting domains
can be selected from the following segments of the Drosophila SMRTER
co-repressor (SEQ ID NO: 12):

amino acids 1698-1924 of SEQ ID NO: 12,
amino acids 2951-3038 of SEQ ID NO: 12,
amino acids 1698-2063 of SEQ ID NO: 12,
amino acids 2094-3040 of SEQ ID NO: 12,
amino acids 2929-3181 of SEQ ID NO: 12,
amino acids 542-950 of SEQ ID NO: 12,
amino acids 2094-3181 of SEQ ID NO: 12,
amino acids 2929-3040 of SEQ ID NO: 12, and
amino acids 2951-3038 of SEQ ID NO: 12,
and conservative variations thereof.



Additional exemplary co-repressors are those containing one or more
of three autonomous repressor domains termed SMRD1, SMRD2, and SMRD3
identified in the SMRTER co-repressor. For example, invention co-repressors can
contain the following autonomous repressor domains derived from Drosophila
SMRTER co-repressor (SEQ ID NO: 12):

amino acids 542-950 of SEQ ID NO: 12,
amino acids 1698-1924 of SEQ ID NO: 12,
amino acids 2951-3038 of SEQ ID NO: 12, and conservative variations
thereof.



Conservative variations of the above-described SMRT co-repressors
are also contemplated to be within the scope of the present invention. Moreover,
proteins, polypeptides and peptides having at least 80% sequence identity with any of
the SMRT co-repressors described herein are also contemplated to be within the scope
of the invention.



In another embodiment according to the present invention, there are
provided chimeric molecules comprising invention isolated peptides and at least a
second molecule. For example, the second molecule in an invention chimeric molecule
can be a polynucleotide or a polypeptide. In one embodiment, the chimeric molecule
is a fusion polypeptide comprising a SMRT co-repressor operably linked to a DNA
binding domain of a transcription factor.

In another embodiment according to the present invention, there are
provided isolated antibodies that bind specifically to invention isolated peptides. In
one embodiment, an antibody of the invention binds specifically to an epitope of a
SMRT co-repressor. Such an antibody is characterized, in part, in that it does not
substantially crossreact with an N-CoR polypeptide. In another embodiment, an
antibody of the invention binds specifically to a complex, which includes a SMRT
co-repressor or peptide portion thereof of the invention, a nuclear receptor and,
optionally, a DNA regulatory element that is specifically bound by the nuclear
receptor. Such an antibody is characterized, in part, in that it does not substantially
crossreact with the nuclear receptor, either alone or bound to the DNA regulatory
element. An antibody of the invention can be a monoclonal antibody, or can be one
of a plurality of polyclonal antibodies, which essentially is a mixed population of
monoclonal antibodies. The invention also relates to a cell line, which produces the
monoclonal antibody of the invention.

Such antibodies can be employed for a variety of purposes, e.g., for
studying tissue localization of invention SMRT co-repressor, the structure of functional
domains, the purification of receptors, as well as in diagnostic applications, therapeutic
applications, and the like. Preferably, for therapeutic applications, the antibodies
employed will be monoclonal antibodies.

The above-described antibodies can be prepared employing standard
techniques, as are well known to those of skill in the art, using the invention SMRT
co-repressor or portions thereof as antigens for antibody production. Both anti-peptide and
anti-fusion protein antibodies can be used [see, for example, Bahouth et al. (1991)
Trends Pharmacol. Sci. vol. 12:338-343; Current Protocols in Molecular Biology
(Ausubel et al., eds.) John Wiley and Sons, New York (1989)]. Factors to consider in
selecting portions of invention SMRT co-repressor for use as immunogen (as either a
synthetic peptide or a recombinantly produced bacterial fusion protein) include
antigenicity, accessibility (i.e., where the selected portion is derived from, e.g., the ligand
binding domain, DNA binding domain, dimerization domain, and the like), uniqueness
of the particular portion selected (relative to known receptors and co-repressors
therefor), and the like.

In another embodiment according to the present invention, there are
provided complexes comprising an invention SMRT co-repressor and a member of
the nuclear receptor superfamily and isolated antibodies that bind to such complexes.
The nuclear receptor can be in the form of a monomer or dimer, for example, a
thyroid hormone receptor homodimer, a retinoic acid receptor homodimer, a retinoid
X receptor homodimer, a thyroid hormone receptor-retinoid X receptor heterodimer, a
retinoic acid receptor-retinoid X receptor heterodimer, an ecdysone
receptor-Ultraspiracle receptor heterodimer, and the like. Optionally or alternatively, the
complex can include a DNA regulatory element, bound specifically by a DNA
binding domain of the nuclear receptor.

The above-described complexes optionally further comprise a response
element for the member of the nuclear receptor superfamily. Such response elements are
well known in the art. Thus, for example, RAR response elements are composed of at
least one direct repeat of two or more half sites separated by a spacer of five nucleotides.
The spacer nucleotides can independently be selected from any one of A, C, G or T.
Each half site of response elements contemplated for use in the practice of the invention
comprises the sequence

-RGBNNM-,

wherein

R is selected from A or G;
B is selected from G, C, or T;
each N is independently selected from A, T, C, or G; and
M is selected from A or C;

with the proviso that at least 4 nucleotides of said -RGBNNM- sequence
are identical with the nucleotides at corresponding positions of the sequence
-AGGTCA-. Response elements employed in the practice of the present invention can
optionally be preceded by Nx, wherein x falls in the range of 0 up to 5.
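
The half-site definition above lends itself to a small checking routine. The following is a minimal sketch, not part of the original disclosure: it tests whether a six-nucleotide string fits the -RGBNNM- pattern and also satisfies the proviso of at least four positions identical to -AGGTCA-. The function and constant names are hypothetical.

```python
# Allowed nucleotides at each of the six positions of -RGBNNM-
ALLOWED = [set("AG"), set("G"), set("GCT"), set("ACGT"), set("ACGT"), set("AC")]
CONSENSUS = "AGGTCA"


def is_half_site(hexamer):
    """True if `hexamer` fits -RGBNNM- and matches AGGTCA at >= 4 positions."""
    s = hexamer.upper()
    if len(s) != 6:
        return False
    fits_pattern = all(base in allowed for base, allowed in zip(s, ALLOWED))
    identical = sum(1 for base, ref in zip(s, CONSENSUS) if base == ref)
    return fits_pattern and identical >= 4


for candidate in ("AGGTCA", "GGTTCA", "AGGTCT", "GGCAAC"):
    print(candidate, is_half_site(candidate))
```

In this sketch, AGGTCT is rejected because T is not permitted at the M position, and GGCAAC is rejected because it fits the pattern but shares only one position with AGGTCA.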






Similarly, TR response elements can be composed of the same half site
repeats, with a spacer of four nucleotides. Alternatively, palindromic constructs as have
been described in the art are also functional as TR response elements.
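
A direct repeat of two half sites with a defined spacer (five nucleotides for RAR response elements, four for TR response elements, as stated above) can be located in a DNA string along the following lines. This is a sketch under stated assumptions: only the direct-repeat arrangement on one strand is searched, palindromic arrangements are not handled, and the function names and test sequence are hypothetical.

```python
import re

# -RGBNNM- written as a regular expression
HALF_SITE = "[AG]G[GCT][ACGT][ACGT][AC]"
CONSENSUS = "AGGTCA"


def _near_consensus(hexamer):
    """Proviso: at least 4 positions identical to AGGTCA."""
    return sum(a == b for a, b in zip(hexamer, CONSENSUS)) >= 4


def find_direct_repeats(seq, spacer):
    """Yield (start, half1, half2) for two half sites in direct orientation
    separated by exactly `spacer` nucleotides."""
    seq = seq.upper()
    # A lookahead is used so that overlapping candidates are all examined.
    pattern = re.compile(f"(?=({HALF_SITE})[ACGT]{{{spacer}}}({HALF_SITE}))")
    for m in pattern.finditer(seq):
        h1, h2 = m.group(1), m.group(2)
        if _near_consensus(h1) and _near_consensus(h2):
            yield m.start(), h1, h2


# Hypothetical test sequence: two AGGTCA half sites separated by five nucleotides.
dna = "TTAGGTCACCAGGAGGTCATT"
print(list(find_direct_repeats(dna, spacer=5)))
print(list(find_direct_repeats(dna, spacer=4)))
```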
The above-described SMRT co-repressor/dimeric receptor complexes
can be dissociated by contacting the complex with a ligand for the member of the 
nuclear receptor superfamily. 

As employed herein, the term "ligand (or ligand precursor) for a member
of the nuclear receptor superfamily" (i.e., intracellular receptor) refers to a substance or
compound which, in its unmodified form (or after conversion to its "active" form),
inside a cell, binds to receptor protein, thereby creating a ligand/receptor complex, which
in turn can activate an appropriate hormone response element. A ligand therefore is a
compound which acts to modulate gene transcription for a gene maintained under the
control of a hormone response element, and includes compounds such as hormones,
growth substances, non-hormone compounds that modulate growth, and the like.
Ligands include steroid or steroid-like hormones, retinoids, thyroid hormones,
pharmaceutically active compounds, and the like. Individual ligands may have the
ability to bind to multiple receptors.

Accordingly, as employed herein, "putative ligand" (also referred to as
"test compound") refers to compounds such as steroid or steroid-like hormones,
pharmaceutically active compounds, and the like, that are suspected to have the ability to
bind to the receptor of interest, and to modulate transcription of genes maintained under
the control of response elements recognized by such receptor.

In another embodiment according to the present invention, there are
provided polynucleotides encoding members of the above-described family of
silencing mediators of retinoic acid and thyroid hormone receptor, or an isoform or
peptide portion thereof (SMRT co-repressors), or an isolated polynucleotide
complementary thereto.

Invention polynucleotides include those encoding a SMRT
co-repressor comprising a repression domain having

a) less than about 83% identity with a Sin3A interaction
domain of N-CoR set forth as amino acids 255 to 312 of SEQ ID NO: 11;

b) less than about 57% identity with repression domain 1 of
N-CoR set forth as amino acids 1 to 312 of SEQ ID NO: 11;

c) less than about 66% identity with a SANT domain of
N-CoR set forth as amino acids 312 to 668 of SEQ ID NO: 11; or

d) less than about 30% identity with repression domain 2 of
N-CoR set forth as amino acids 736 to 1031 of SEQ ID NO: 11.

In addition, an invention polynucleotide can encode a mouse SMRTβ
isoform having an amino acid sequence as set forth in SEQ ID NO: 9 or conservative
variations thereof, or a polynucleotide having a nucleotide sequence as set forth in
SEQ ID NO: 8.

Further examples of invention polynucleotides are those comprising a
nucleotide sequence selected from the group consisting of:

nucleotides 1 to 3094 of SEQ ID NO: 4;
nucleotides 1 to 3718 of SEQ ID NO: 6;
nucleotides 1 to 2801 of SEQ ID NO: 8;
nucleotides 1 to 8388 of SEQ ID NO: 6;
nucleotides 1 to 7465 of SEQ ID NO: 8; and
nucleotides 1 to 8561 of SEQ ID NO: 4.

The invention polynucleotides are further exemplified by a polynucleotide encoding a
human SMRT co-repressor having an amino acid sequence as set forth in SEQ ID
NO: 5, for example, a nucleotide sequence as set forth in SEQ ID NO: 4; by a
polynucleotide encoding a mouse SMRTα isoform having an amino acid sequence as
set forth in SEQ ID NO: 7, for example, a nucleotide sequence as set forth in SEQ ID
NO: 6; and by a polynucleotide encoding a mouse SMRTβ isoform having an amino
acid sequence as set forth in SEQ ID NO: 9, for example, a nucleotide sequence as set
forth in SEQ ID NO: 8. A polynucleotide of the invention is further exemplified by
polynucleotides encoding peptide portions of a SMRT co-repressor such as a
polynucleotide containing nucleotides 1 to 3094 of SEQ ID NO: 4; nucleotides 1
to 3718 of SEQ ID NO: 6; or nucleotides 1 to 2801 of SEQ ID NO: 8, which can
repress the transcriptional activity of a nuclear receptor, particularly a nuclear receptor
that is in the form of a dimer.

Additional invention polynucleotides include those encoding a full
length insect SMRTER co-repressor having an amino acid sequence as set forth in
SEQ ID NO: 12, or conservative variations thereof.



Additional exemplary invention polynucleotides are those encoding
one or both of the receptor interacting domains (ERID1 and ERID2) identified in
invention co-repressors. For example, polynucleotides encoding such receptor
interacting domains can be selected from those encoding the following segments of
the Drosophila SMRTER co-repressor (SEQ ID NO: 12):

amino acids 1698-1924 of SEQ ID NO: 12,
amino acids 2951-3038 of SEQ ID NO: 12,
amino acids 1698-2063 of SEQ ID NO: 12,
amino acids 2094-3040 of SEQ ID NO: 12,
amino acids 2929-3181 of SEQ ID NO: 12,
amino acids 542-950 of SEQ ID NO: 12,
amino acids 2094-3181 of SEQ ID NO: 12,
amino acids 2929-3040 of SEQ ID NO: 12, and
amino acids 2951-3038 of SEQ ID NO: 12,
and conservative variations thereof.



Additional exemplary invention polynucleotides are those encoding
one or more of three autonomous repressor domains termed SMRD1, SMRD2, and
SMRD3 identified in the invention co-repressors. For example, polynucleotides
encoding such autonomous repressor domains can be selected from those encoding
the following segments of the Drosophila SMRTER co-repressor (SEQ ID NO: 12):

amino acids 542-950 of SEQ ID NO: 12,
amino acids 1698-1924 of SEQ ID NO: 12,
amino acids 2951-3038 of SEQ ID NO: 12, and conservative variations
thereof.

A polynucleotide that has at least 80% sequence identity or that
hybridizes (preferably under high stringency conditions) with any one of the
above-described polynucleotides is also contemplated to be within the scope of this
invention.

A polynucleotide of the invention can be operably linked to a second 
nucleotide sequence and, therefore, can encode a fusion polypeptide, for example, a 
SMRT co-repressor, or peptide portion thereof, operably linked to a DNA binding 
domain of a transcription factor. 


Additional examples of invention isolated oligonucleotides are those
which generally are at least about 15 nucleotides in length and can hybridize
specifically to the polynucleotide of the invention, but not to a polynucleotide
encoding an N-CoR polypeptide (SEQ ID NO: 11). An oligonucleotide of the
invention can be useful as a probe, or as a primer for a PCR procedure, or can encode
a peptide containing at least five contiguous amino acids of a SMRT co-repressor. In
one embodiment, an oligonucleotide of the invention encodes at least five contiguous
amino acids of a sequence such as that shown as amino acids 720 to 745 of SEQ ID
NO: 5; or amino acids 716 to 742 of SEQ ID NO: 7; or amino acids 497 to 523 of
SEQ ID NO: 9. In another embodiment, an oligonucleotide of the invention can
hybridize specifically to a polynucleotide encoding human SMRT (SEQ ID NO: 5) or
mouse SMRTα (SEQ ID NO: 7), and, optionally, to a polynucleotide encoding mouse
SMRTβ (SEQ ID NO: 9).

The phrase "substantially the same" as used herein in reference to a
nucleotide sequence of DNA, a ribonucleotide sequence of RNA, or an amino acid
sequence of protein, means sequences that have slight and non-consequential sequence
variations from the actual sequences disclosed herein. Species that are substantially the
same are considered to be equivalent to the disclosed sequences and as such are within
the scope of the appended claims. In this regard, "slight and non-consequential sequence
variations" means that sequences substantially the same as the DNA, RNA, or proteins
disclosed and claimed herein are functionally equivalent to the sequences disclosed and
claimed herein. Functionally equivalent sequences will function in substantially the
same manner to produce substantially the same compositions as the nucleic acid and
amino acid compositions disclosed and claimed herein. In particular, functionally
equivalent DNAs encode proteins that are the same as those disclosed herein or that have
conservative amino acid variations, such as substitution of a non-polar residue for
another non-polar residue or a charged residue for a similarly charged residue. These
changes include those recognized by those of skill in the art as those that do not
substantially alter the tertiary structure of the protein.
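
The two classes of conservative variation named above (non-polar for non-polar, and charged for similarly charged) can be expressed as a simple lookup. The residue groupings below follow a common textbook classification and are an assumption of this sketch; the specification does not enumerate the residues in each class, and other conservative classes (for example, polar uncharged residues) are deliberately not modeled.

```python
# Assumed one-letter residue groupings (textbook convention, not from the text).
NONPOLAR = set("GAVLIMFWP")
POSITIVE = set("KRH")
NEGATIVE = set("DE")
_GROUPS = (NONPOLAR, POSITIVE, NEGATIVE)


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues are identical or fall in the same assumed group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return True
    return any(a in group and b in group for group in _GROUPS)


def classify_variation(reference: str, variant: str) -> str:
    """Label an equal-length variant relative to a reference protein sequence."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have the same length")
    changes = [(r, v) for r, v in zip(reference.upper(), variant.upper()) if r != v]
    if not changes:
        return "identical"
    if all(is_conservative(r, v) for r, v in changes):
        return "conservative"
    return "non-conservative"
```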

In another embodiment according to the present invention, there are 
provided vectors comprising an invention polynucleotide, and host cells containing 
invention polynucleotides. The invention vector can be an expression vector, 
including, for example, a viral vector, and the polynucleotide, or a vector containing 
the polynucleotide, can be contained in a host cell. In one embodiment, the 
polynucleotide of the invention is operably linked to a tissue-specific DNA regulatory
element. In another embodiment, a SMRT co-repressor or peptide portion thereof 
encoded by the polynucleotide is expressed in a host cell. 

In another embodiment according to the present invention, there are 
provided methods for identifying an agent that modulates the repressor potential of a
SMRT co-repressor. In this embodiment, the invention method comprises contacting 
a host cell with an agent, and detecting a change in the level of expression of a first 
expressible nucleotide sequence in response to the agent, thereby identifying an agent 
that modulates the repressor potential of a SMRT co-repressor. In such a method, the 
host cell is characterized, in part, in that it contains a first expressible nucleotide 
sequence operably linked to a first DNA regulatory element, and expresses a fusion
polypeptide composed of an invention SMRT co-repressor, or peptide portion thereof, 
and a DNA binding domain of a first transcription factor that can specifically bind the 
first DNA regulatory element. Binding of the DNA binding domain of the first 
transcription factor to the first DNA regulatory element results in expression of the 
first expressible nucleotide sequence in the host cell. 

In another embodiment according to the present invention, there are 
provided methods for identifying an agent that modulates a function of an invention 
SMRT co-repressor. In this embodiment, the invention method comprises contacting 
an invention SMRT co-repressor, a member of the nuclear receptor superfamily, and 
an agent, and detecting an altered activity of the SMRT co-repressor in the presence 
of the agent as compared to the absence of the agent, thereby identifying an agent that 
modulates a function of the SMRT co-repressor. 



A method of the invention can be performed, for example, by 
contacting a host cell with an agent, and detecting a change in the level of expression 
of a first expressible nucleotide sequence in response to the agent, thereby identifying 
an agent that modulates the repressor potential of a SMRT co-repressor. In such a
method, the host cell is characterized, in part, in that it contains a first expressible
nucleotide sequence operably linked to a first DNA regulatory element, and expresses
a fusion polypeptide composed of a SMRT co-repressor or peptide portion thereof of
the invention, and a DNA binding domain of a first transcription factor, which can
specifically bind the first DNA regulatory element; binding of the DNA binding
domain of the first transcription factor to the first DNA regulatory element results in
expression of the first expressible nucleotide sequence in the host cell. The first
expressible nucleotide sequence can be an endogenous gene, which is normally
present in the host cell, or can be a sequence that has been introduced into the host
cell, either transiently or stably, using methods of recombinant DNA technology. In
one embodiment, the first DNA binding domain is a GAL4 DNA binding domain and
the first DNA regulatory element is a GAL4 DNA regulatory element that is operably
linked to an expressible nucleotide sequence, for example, a reporter gene, and is
introduced into the host cell.

Thus, the invention method can identify an agent that increases or
decreases the repressor potential of the SMRT co-repressor, or an agent that
increases or decreases the function of the SMRT co-repressor. The agent can directly
interact with the SMRT co-repressor or peptide portion thereof, thereby modulating
the repressor potential or function of the SMRT co-repressor, or can interact with a
cellular molecule that, in turn, can alter the repressor potential or function of a SMRT
co-repressor, thereby increasing or decreasing the repressor potential of the SMRT co-
repressor.

The host cell can optionally contain a second expressible nucleotide
sequence operably linked to a second DNA regulatory element, and can express a
second fusion polypeptide, which is composed of an N-CoR polypeptide, or a
repressor domain thereof, and a DNA binding domain of a second transcription factor,
which can specifically bind the second DNA regulatory element. By comparing the
level of expression of the first expressible nucleotide sequence and the second
expressible nucleotide sequence in the host cell upon contacting the host cell with the
agent, an agent that independently or coordinately modulates SMRT and N-CoR
repressor activity can be identified. For example, detecting a change in the level of
expression of the first expressible nucleotide sequence, but not in the level of
expression of the second expressible nucleotide sequence, due to contacting the host
cell with the agent identifies an agent that modulates the repressor potential of a SMRT
co-repressor, but not of an N-CoR polypeptide.
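
The comparison of the two reporters can be summarized as a small decision rule. The sketch below assumes each reporter is reduced to a fold change (signal with agent divided by signal without agent) and that a fold change within a tolerance of 1.0 counts as "no change". The tolerance value and the function names are illustrative assumptions, not values taken from the specification.

```python
def fold_change(with_agent: float, without_agent: float) -> float:
    """Reporter signal with agent relative to signal without agent."""
    if without_agent <= 0:
        raise ValueError("baseline signal must be positive")
    return with_agent / without_agent


def classify_selectivity(smrt_fold: float, ncor_fold: float,
                         tolerance: float = 0.2) -> str:
    """Compare the SMRT-fusion (first) and N-CoR-fusion (second) reporters."""
    smrt_changed = abs(smrt_fold - 1.0) > tolerance
    ncor_changed = abs(ncor_fold - 1.0) > tolerance
    if smrt_changed and ncor_changed:
        return "modulates SMRT and N-CoR"
    if smrt_changed:
        return "modulates SMRT only"
    if ncor_changed:
        return "modulates N-CoR only"
    return "no effect detected"
```

An agent that triples the first reporter while leaving the second essentially unchanged would be labeled as modulating SMRT only, which corresponds to the example given in the paragraph above.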

In practicing a method of the invention, the SMRT co-repressor, or 
peptide portion thereof, can be, for example, an amino acid sequence such as amino 
acids 1 to 1031 of SEQ ID NO: 5; amino acids 1 to 1031 of SEQ ID NO: 7; or amino 
acids 1 to 813 of SEQ ID NO: 9. The agent can be, for example, an antibody or
antigen binding fragment thereof, a peptide, or a small organic molecule.

In another embodiment according to the present invention, there are
provided methods of modulating the transcriptional potential of a member of the 
nuclear receptor superfamily (nuclear receptor) in a cell, the method comprising 
introducing an invention isolated polynucleotide into the cell, whereby the 
polynucleotide or an expression product of the polynucleotide alters the level of a 
SMRT co-repressor in the cell, thereby modulating the transcriptional potential of the 
nuclear receptor. 

In performing a method of the invention, an agent that alters an 
interaction of the SMRT co-repressor, or peptide portion thereof, with the nuclear 
receptor can be identified using a binding assay, such as an electrophoretic mobility
shift assay, or an assay wherein the level of expression of an expressible nucleotide
sequence is determined.
Such a method can also identify an agent that alters the ability of the invention SMRT 
co-repressor, or peptide portion thereof, to interact specifically with the nuclear 
receptor, but does not alter the level of expression of the expressible nucleotide 
sequence; or an agent that alters the level of expression of the expressible nucleotide 
sequence, but does not alter interaction of the SMRT co-repressor or peptide portion 
thereof with the nuclear receptor; or an agent that alters an interaction of the SMRT 
co-repressor, or peptide portion thereof, with the nuclear receptor and alters the level 
of expression of the expressible nucleotide sequence. The agent can, but need not be, 
a ligand for the nuclear receptor, and the method can be performed in a cell or in a
reaction mixture in vitro.

Alternatively, an invention polynucleotide can be introduced into the
cell, whereby the polynucleotide, or an expression product of the polynucleotide, 
alters the level of a SMRT co-repressor in the cell, thereby modulating the 
transcriptional potential of the nuclear receptor. The polynucleotide can encode an
invention SMRT co-repressor or peptide portion thereof, which can be expressed in
the cell, thereby increasing the level of a SMRT co-repressor, or peptide portion
thereof, in the cell. The polynucleotide also can be an antisense polynucleotide that
decreases the level of a SMRT co-repressor in the cell.

In another embodiment according to the present invention, there are 
provided methods of identifying a molecule that interacts specifically with a SMRT 
co-repressor. In this embodiment, invention methods comprise contacting the 
molecule with an invention SMRT co-repressor and detecting specific binding of the 
molecule to the SMRT co-repressor, thereby identifying a molecule that interacts
specifically with a SMRT co-repressor. 

The molecule can be any molecule that interacts specifically with a 
SMRT co-repressor, including, for example, a small organic molecule such as a drug, 
a peptide, a nucleic acid molecule, and the like. In one embodiment, the molecule is a
cellular factor, for example, a cellular protein that modulates the ability of a SMRT
co-repressor to repress transcriptional activity of a nuclear receptor. In another
embodiment, the method further involves isolating the molecule that interacts
specifically with the SMRT co-repressor or peptide portion thereof. 

In accordance with yet another aspect of the present invention, there are 
provided methods to block the repressing effect of invention SMRT co-repressors, said 
method comprising administering an effective amount of an antibody as described 
herein. Alternatively, a silencing domain of a nuclear receptor can be employed. Those 
of skill in the art can readily determine suitable methods for administering said
antibodies, and suitable quantities for administration, which will vary depending on
numerous factors, such as the indication being treated, the condition of the subject, and
the like. 

In accordance with another aspect of the present invention, there is 
provided a method to repress (or silence) the activity of a member of the nuclear receptor
superfamily containing a silencing domain that represses basal level promoter activity of 
target genes, said method comprising contacting said member of the nuclear receptor 
superfamily with a sufficient quantity of an invention SMRT co-repressor so as to 
repress the activity of said member. Members of the nuclear receptor superfamily 
contemplated for repression in accordance with this aspect of the present invention
include, for example, thyroid hormone receptor, retinoic acid receptor, vitamin D 
receptor, peroxisome proliferator activated receptor, and the like. 

In accordance with yet another aspect of the present invention, there is 
provided a method to identify compounds which relieve the repression of nuclear
receptor activity caused by an invention SMRT co-repressor, said method comprising
comparing the size of the SMRT co-repressor/dimeric receptor complex (i.e., complexes
comprising the invention SMRT co-repressor and a homodimeric or heterodimeric
member of the nuclear receptor superfamily) upon exposure to test compound, relative to
the size of said complex in the absence of test compound. An observed size
corresponding to intact complex is indicative of an inactive compound, while an
observed size that reflects dissociation of the complex is indicative of a compound that
disrupts the complex, thereby relieving the repression caused thereby. Optionally, the
complex employed in this assay further comprises a response element for said member
of the nuclear receptor superfamily.

The size of the above-described complex can readily be determined 
employing various techniques available in the art. For example, electrophoretic mobility 
shift assays (EMSA) can be employed (wherein receptor alone or receptor-SMRT co-
repressor complex is bound to target DNA and the relative mobility thereof determined).
Those of skill in the art can readily identify other methodology which can be employed
to determine the size of the complex as a result of exposure to putative ligand. 
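
The interpretation of such a mobility comparison can be sketched as follows. The sketch assumes that each lane is summarized by a single relative mobility value and that a test lane is assigned to whichever reference band (receptor alone, or receptor-SMRT co-repressor complex) it lies closer to. The nearest-reference rule, the numeric representation, and the function name are assumptions made for illustration only.

```python
def interpret_shift(test_mobility: float,
                    receptor_only_mobility: float,
                    complex_mobility: float) -> str:
    """Assign a test lane to the nearer of two reference bands.

    Returns "intact" when the band runs like the receptor-SMRT complex
    (compound inactive) and "dissociated" when it runs like receptor
    alone (compound disrupts the complex).
    """
    if receptor_only_mobility == complex_mobility:
        raise ValueError("reference bands must be distinguishable")
    distance_to_complex = abs(test_mobility - complex_mobility)
    distance_to_free = abs(test_mobility - receptor_only_mobility)
    if distance_to_complex <= distance_to_free:
        return "intact"
    return "dissociated"
```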

In accordance with a still further aspect of the present invention, there is
provided a method to identify compounds which relieve the repression of nuclear
receptor activity caused by an invention SMRT co-repressor, without substantially
activating said receptor, said method comprising:

comparing the reporter signal produced by two different expression
systems in the absence and presence of test compound,

wherein said first expression system comprises a complex
comprising:

a homodimeric or heterodimeric member of the nuclear
receptor superfamily selected from thyroid hormone receptor
homodimer, thyroid hormone receptor-retinoid X receptor
heterodimer, retinoic acid receptor homodimer, or retinoic acid
receptor-retinoid X receptor heterodimer,

a response element for said member of the nuclear
receptor superfamily, wherein said response element is
operatively linked to a reporter gene, and

optionally, invention SMRT co-repressor, and

wherein said second expression system comprises a complex
comprising:

a homodimeric or heterodimeric form of the same
member of the nuclear receptor superfamily as employed in said
first expression system, wherein said member is mutated such
that it retains hormone dependent activation activity but has lost
its ability to repress basal level promoter activity of target genes,

the same response element-reporter combination as
employed in said first expression system, and

optionally, invention SMRT co-repressor, and thereafter

selecting those compounds which provide:

a higher reporter signal upon exposure of said compound
to said first expression system, relative to reporter signal in the
absence of said compound, and

substantially the same reporter signal upon exposure of
said compound to said second expression system, relative to
reporter signal in the absence of said compound,

wherein said selected compounds are capable of relieving the repression
of nuclear receptor activity caused by a SMRT co-repressor having a structure and
function characteristic of an invention SMRT co-repressor but substantially lacking the
ability to activate nuclear receptor activity.

The addition of invention SMRT co-repressor is optional in the above-
described assay because it is present endogenously in most host cells employed for such
assays. It is preferred, to ensure the presence of a fairly constant amount of SMRT co-
repressor, and to ensure that SMRT co-repressor is not a limiting reagent, that SMRT co-
repressor be supplied exogenously to the above-described assays.

Mutant receptors contemplated for use in the practice of the present 
invention are conveniently produced by expression plasmids, introduced into the host 
cell by transfection. Mutant receptors contemplated for use herein include RAR403 
homodimers, RAR403-containing heterodimers, TR160 homodimers, TR160-containing
heterodimers, and the like.

Reporter constructs contemplated for use in the practice of the present 
invention comprise: 

(a) a promoter that is operable in the host cell,
(b) a hormone response element, and
(c) a DNA segment encoding a reporter protein,

wherein the reporter protein-encoding DNA segment is
operatively linked to the promoter for transcription of the DNA
segment, and

wherein the hormone response element is operatively
linked to the promoter for activation thereof.

Hormone response elements contemplated for use in the practice of the 
present invention are well known in the art, as has been noted previously. 

Exemplary reporter genes include chloramphenicol acetyltransferase (CAT),
luciferase (LUC), beta-galactosidase (β-gal), and the like. Exemplary promoters include
the simian virus (SV) promoter or modified form thereof (e.g., ΔSV), the thymidine kinase
(TK) promoter, the mammary tumor virus (MTV) promoter or modified form thereof
(e.g., ΔMTV), and the like [see, for example, Mangelsdorf et al., in Nature 345:224-229
(1990), Mangelsdorf et al., in Cell 66:555-561 (1991), and Berger et al., in J. Steroid
Biochem. Molec. Biol. 41:733-738 (1992)].

As used herein in the phrase "operative response element" or
"operatively linked", the word "operative" means that the respective DNA sequences
(represented by the terms "GAL4 response element" and "reporter gene") are
operational, i.e., work for their intended purposes, such that after the two segments are
linked, upon appropriate activation by a ligand-receptor complex, the reporter gene will
be expressed as a result of the "GAL4 response element" having been "turned on"
or otherwise activated.

In practicing the above-described functional bioassay, the expression
plasmid and the reporter plasmid are co-transfected into suitable host cells. The
transfected host cells are then cultured in the presence and absence of a test compound to
determine if the test compound is able to produce activation of the promoter operatively
linked to the response element of the reporter plasmid. Thereafter, the transfected and
cultured host cells are monitored for induction (i.e., the presence) of the product of the
reporter gene sequence.

Any cell line can be used as a suitable "host" for the functional bioassay
contemplated for use in the practice of the present invention. Thus, cells contemplated
for use in the practice of the present invention include transformed cells, non-
transformed cells, neoplastic cells, primary cultures of different cell types, and the like.
Exemplary cells which can be employed in the practice of the present invention include
Schneider cells, CV-1 cells, HuTu80 cells, F9 cells, NTERA2 cells, NB4 cells, HL-60
cells, 293 cells, HeLa cells, yeast cells, and the like. Preferred host cells for use in the
functional bioassay system are COS cells and CV-1 cells. COS-1 (referred to as COS)
cells are monkey kidney cells that express SV40 T antigen (Tag), while CV-1 cells do
not express SV40 Tag. The presence of Tag in the COS-1 derivative lines allows the
introduced expression plasmid to replicate and provides a relative increase in the amount
of receptor produced during the assay period. CV-1 cells are presently preferred because
they are particularly convenient for gene transfer studies and provide a sensitive and
well-described host cell system.

The above-described cells (or fractions thereof) are maintained under
physiological conditions when contacted with physiologically active compound.
"Physiological conditions" are readily understood by those of skill in the art to comprise
an isotonic, aqueous nutrient medium at a temperature of about 37°C.

In accordance with yet another aspect of the present invention, there is
provided a method to identify compounds which activate nuclear receptor activity, but
substantially lack the ability to relieve the repression caused by an invention SMRT co-
repressor, said method comprising:

comparing the reporter signal produced by two different expression
systems in the absence and presence of test compound,

wherein said first expression system comprises a complex
comprising:

a homodimeric or heterodimeric member of the nuclear
receptor superfamily selected from thyroid hormone receptor
homodimer, thyroid hormone receptor-retinoid X receptor
heterodimer, retinoic acid receptor homodimer, or retinoic acid
receptor-retinoid X receptor heterodimer,

a response element for said member of the nuclear
receptor superfamily, wherein said response element is
operatively linked to a reporter, and

optionally, invention SMRT co-repressor, and

wherein said second expression system comprises a complex
comprising:

a homodimeric or heterodimeric form of the same
member of the nuclear receptor superfamily as employed in said
first expression system, wherein said member is mutated such
that it retains hormone dependent activation activity but has lost
its ability to repress basal level promoter activity of target genes,

the same response element-reporter combination as
employed in said first expression system, and

optionally, invention SMRT co-repressor, and thereafter

selecting those compounds which provide:

a higher reporter signal upon exposure of said compound to said
second expression system, relative to reporter signal in the absence of
said compound, and

substantially the same reporter signal upon exposure of said
compound to said first expression system, relative to reporter signal in
the absence of said compound,

wherein said selected compounds are capable of activating nuclear
receptor activity, but substantially lack the ability to relieve the repression caused by
a SMRT co-repressor having a structure and function characteristic of an invention
SMRT co-repressor for retinoic acid and thyroid receptors.

In accordance with a still further aspect of the present invention, there is
provided a method to identify compounds which relieve the repression of nuclear
receptor activity caused by an invention SMRT co-repressor, and activate said receptor,
said method comprising:

comparing the reporter signal produced by two different expression
systems in the absence and presence of test compound,

wherein said first expression system comprises a complex
comprising:

a homodimeric or heterodimeric member of the nuclear
receptor superfamily selected from thyroid hormone receptor
homodimer, thyroid hormone receptor-retinoid X receptor
heterodimer, retinoic acid receptor homodimer, or retinoic acid
receptor-retinoid X receptor heterodimer,

a response element for said member of the nuclear
receptor superfamily, wherein said response element is
operatively linked to a reporter, and

optionally, invention SMRT co-repressor, and

wherein said second expression system comprises a complex
comprising:

a homodimeric or heterodimeric form of the same
member of the nuclear receptor superfamily as employed in said
first expression system, wherein said member is mutated such
that it retains hormone dependent activation activity but has lost
its ability to repress basal level promoter activity of target genes,

the same response element-reporter combination as
employed in said first expression system, and

optionally, invention SMRT co-repressor, and thereafter

selecting those compounds which provide:

increased reporter signal upon exposure of said compound to said
second expression system, relative to reporter signal in the absence of
said compound, and

substantially increased reporter signal upon exposure of said
compound to said first expression system, relative to reporter signal in
the absence of said compound,

wherein said selected compounds are capable of relieving the repression
of nuclear receptor activity caused by a SMRT co-repressor having a structure and
function characteristic of the silencing mediator for retinoic acid and thyroid receptors,
and activating said receptor.
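
The selection rules of the three two-system reporter comparisons described above (relief of repression without activation, activation without relief, and relief together with activation) can be collected into one illustrative decision function. Here the first expression system carries the unmutated receptor and the second carries the mutant receptor that has lost repression. The tolerance used to decide what counts as "substantially the same" signal, and all names, are assumptions of this sketch rather than values from the specification.

```python
def classify_compound(first_with: float, first_without: float,
                      second_with: float, second_without: float,
                      tolerance: float = 0.2) -> str:
    """Classify a test compound from reporter signals in two systems.

    first_*  : system with the unmutated receptor dimer
    second_* : system with the mutant receptor that no longer represses
    *_with / *_without : signal in the presence / absence of compound
    """
    if first_without <= 0 or second_without <= 0:
        raise ValueError("baseline signals must be positive")
    first_fold = first_with / first_without
    second_fold = second_with / second_without

    first_up = first_fold > 1.0 + tolerance
    second_up = second_fold > 1.0 + tolerance
    first_same = abs(first_fold - 1.0) <= tolerance
    second_same = abs(second_fold - 1.0) <= tolerance

    if first_up and second_same:
        return "relieves repression without activating"
    if first_same and second_up:
        return "activates without relieving repression"
    if first_up and second_up:
        return "relieves repression and activates"
    return "unclassified"
```

A compound that raises the first-system signal fivefold while leaving the second system unchanged falls in the first category; any pattern not named in the three methods (for example, a decreased signal) is returned as unclassified.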

In accordance with still another embodiment of the present invention,
there are provided modified forms of the above-described SMRT co-repressor,
including:

full length silencing mediator for retinoic acid and thyroid receptors plus
GAL4 DNA binding domain,

full length silencing mediator for retinoic acid and thyroid receptors plus
GAL4 activation domain,

full length silencing mediator for retinoic acid and thyroid receptors plus
glutathione S-transferase (GST) tag,

and the like.

The above-described modified forms of invention SMRT co-repressor
can be used in a variety of ways, e.g., in the assays described herein.

An especially preferred modified SMRT co-repressor of the invention
comprises full length silencing mediator for retinoic acid and thyroid receptors plus
GAL4 activation domain.



In accordance with a still further embodiment of the present invention,
there is provided a method to identify compounds which disrupt the ability of an
invention SMRT co-repressor to complex with nuclear receptors, without substantially
activating said receptor, said method comprising:

comparing the reporter signal produced by two different expression
systems in the absence and presence of test compound,

wherein said first expression system comprises a complex
comprising:

a modified SMRT co-repressor as described above,

a homodimeric or heterodimeric member of the nuclear
receptor superfamily selected from thyroid hormone receptor
homodimer, thyroid hormone receptor-retinoid X receptor
heterodimer, retinoic acid receptor homodimer or retinoic acid
receptor-retinoid X receptor heterodimer, and

a response element for said member of the nuclear
receptor superfamily, wherein said response element is
operatively linked to a reporter, and

wherein said second expression system comprises a complex
comprising:

said modified SMRT co-repressor,

a homodimeric or heterodimeric form of the same
member of the nuclear receptor superfamily as employed in said
first expression system, wherein said member is mutated such
that it retains hormone dependent activation activity but has lost
its ability to repress basal level promoter activity of target genes,
and

the same response element-reporter combination as
employed in said first expression system, and thereafter

selecting those compounds which provide:

a lower reporter signal upon exposure of said compound to said
first expression system, relative to reporter signal in the absence of said
compound, and

substantially the same reporter signal upon exposure of said
compound to said second expression system, relative to reporter signal in
the absence of said compound,

wherein said selected compounds are capable of disrupting the ability of
a SMRT co-repressor having a structure and function characteristic of the silencing
mediator for retinoic acid and thyroid receptors to complex with nuclear receptors,
without substantially activating said receptor.

Mutant receptors contemplated for use in this embodiment of the present
invention include RAR403 homodimers, RAR403-containing heterodimers, TR160
homodimers, TR160-containing heterodimers, and the like.

Suitable host cells for use in this embodiment of the present invention
include mammalian cells as well as yeast cells. Yeast cells are presently preferred
because they introduce no background, since SMRT (i.e., the silencing mediator for
retinoic acid receptor (RAR) and thyroid hormone receptor (TR)) is not
endogenous to yeast.

In accordance with yet another embodiment of the present invention,
there is provided a method to identify compounds which activate nuclear receptor
activity, but substantially lack the ability to disrupt a complex comprising a nuclear
receptor and an invention SMRT co-repressor, said method comprising:

comparing the reporter signal produced by two different expression
systems in the absence and presence of test compound,

wherein said first expression system comprises a complex
comprising:

a modified SMRT co-repressor as described above,

a homodimeric or heterodimeric member of the nuclear
receptor superfamily selected from thyroid hormone receptor
homodimer, thyroid hormone receptor-retinoid X receptor
heterodimer, retinoic acid receptor homodimer or retinoic acid
receptor-retinoid X receptor heterodimer, and

a response element for said member of the nuclear
receptor superfamily, wherein said response element is
operatively linked to a reporter, and

wherein said second expression system comprises:

said modified SMRT co-repressor,

a homodimeric or heterodimeric form of the same
member of the nuclear receptor superfamily as employed in said
first expression system, wherein said member is mutated such
that it retains hormone dependent activation activity but has lost
its ability to repress basal level promoter activity of target genes,
and

the same response element-reporter combination as
employed in said first expression system, and thereafter

selecting those compounds which provide:

a higher reporter signal upon exposure of said compound to said
second expression system, relative to reporter signal in the absence of
said compound, and

substantially the same reporter signal upon exposure of said
compound to said first expression system, relative to reporter signal in
the absence of said compound,

wherein said selected compounds are capable of activating nuclear
receptor activity, but substantially lack the ability to disrupt the complex of an invention
SMRT co-repressor.

Suitable host cells for use in this embodiment of the present invention 
include mammalian cells as well as yeast cells. Yeast cells are presently preferred 
because they introduce no background since SMRT is not endogenous to yeast.



In accordance with a still fiirther embodiment of the present invention, 
there is provided a method to identify compoimds which activate a nuclear receptor, and 
disrupt the ability of an invention SMRT co-repressor to complex with said receptor, 
20 said method comprising: 

comparing the reporter signal produced by two different expression 
systems in the absence and presence of test compound, 

wherein said first expression system comprises a complex
comprising:

a modified SMRT co-repressor as described above,

a homodimeric or heterodimeric member of the nuclear
receptor superfamily selected from thyroid hormone receptor
homodimer, thyroid hormone receptor-retinoid X receptor
heterodimer, retinoic acid receptor homodimer or retinoic acid
receptor-retinoid X receptor heterodimer, and

a response element for said member of the nuclear
receptor superfamily, wherein said response element is
operatively linked to a reporter, and



wherein said second expression system comprises a complex
comprising:

said modified SMRT co-repressor,

the same homodimeric or heterodimeric member of the
nuclear receptor superfamily as employed in said first expression
system, wherein said member is mutated such that it retains
hormone dependent activation activity but has lost its ability to
repress basal level promoter activity of target genes, and

the same response element-reporter combination as
employed in said first expression system, and thereafter

selecting those compounds which provide: 

a reduction in reporter signal upon exposure of compound to said
first expression system, relative to reporter signal in the absence of said
compound, and

increased reporter signal upon exposure of compound to said
second expression system, relative to reporter signal in the absence of
said compound,



wherein said selected compounds are capable of activating a nuclear
receptor and disrupting a complex comprising nuclear receptor and a SMRT
co-repressor having a structure and function characteristic of the silencing mediator for
retinoic acid and thyroid receptors.

Suitable host cells for use in this embodiment of the present invention
include mammalian cells as well as yeast cells. Yeast cells are presently preferred
because they introduce no background since SMRT is not endogenous to yeast.

In accordance with yet another aspect of the present invention, there is
provided a method to identify compounds which activate a nuclear receptor and/or
disrupt the ability of an invention SMRT co-repressor to complex with said receptor,
said method comprising:



comparing the reporter signals produced by a combination expression 
system in the absence and presence of test compound, 

wherein said combination expression system comprises:

a first homodimeric or heterodimeric member of the
nuclear receptor superfamily selected from thyroid hormone
receptor homodimer, thyroid hormone receptor-retinoid X
receptor heterodimer, retinoic acid receptor homodimer, or
retinoic acid receptor-retinoid X receptor heterodimer,

a second homodimeric or heterodimeric form of the same
member of the nuclear receptor superfamily as employed in said
first homodimer or heterodimer, wherein said member is mutated
such that it retains hormone dependent activation activity but has
lost its ability to repress basal level promoter activity of target
genes (i.e., provides basal level expression),

wherein either said first homodimer (or
heterodimer) or said second homodimer (or heterodimer)
is operatively linked to a GAL4 DNA binding domain,

a response element for said member of the nuclear
receptor superfamily, wherein said response element is
operatively linked to a first reporter,

a GAL4 response element, wherein said response element
is operatively linked to a second reporter, and

optionally a SMRT co-repressor of nuclear receptor
activity, said SMRT co-repressor having a structure and function
characteristic of the silencing mediator for retinoic acid and
thyroid receptors, and thereafter

identifying as capable of relieving the repression of nuclear receptor
activity caused by a SMRT co-repressor having a structure and function characteristic of
the silencing mediator for retinoic acid and thyroid receptors, but substantially lacking
the ability to activate nuclear receptor activity those compounds which provide:

a higher reporter signal from the reporter responsive to the first
member upon exposure of said compound to said first member, relative
to reporter signal in the absence of said compound, and

substantially the same reporter signal from the reporter
responsive to the second member upon exposure of said compound to
said second member, relative to reporter signal in the absence of said
compound, or

identifying as capable of activating nuclear receptor activity, but
substantially lacking the ability to relieve the repression caused by a SMRT co-repressor
having a structure and function characteristic of the silencing mediator for retinoic acid
and thyroid receptors those compounds which provide:

a higher reporter signal from the reporter responsive to the second
member upon exposure of said compound to said second member,
relative to reporter signal in the absence of compound, and

substantially the same reporter signal from the reporter
responsive to the first member upon exposure of said compound to said
first member, relative to reporter signal in the absence of said compound,
or

identifying as capable of relieving the repression of nuclear receptor
activity caused by a SMRT co-repressor having a structure and function characteristic of
the silencing mediator for retinoic acid and thyroid receptors, and activating said
receptor those compounds which provide:

a higher reporter signal from the reporter responsive to the second
member upon exposure of said compound to said second member,
relative to reporter signal in the absence of said compound, and

a greater increase in reporter signal from the reporter responsive
to the first member upon exposure of said compound to said first
member, relative to reporter signal in the absence of said compound.

Thus, the change in expression level of the two different reporters
introduced in a single transfection can be monitored simultaneously. Based on the
results of this single transfection, one can readily identify the mode of interaction of test
compound with the receptor/SMRT complex.
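
The three outcomes enumerated above amount to a small decision rule applied to the two reporter read-outs. The following Python sketch is illustrative only: the function name, the use of fold change (signal with compound divided by signal without compound), and the 20% tolerance used to decide what counts as "substantially the same" signal are assumptions that do not appear in this specification.

```python
def classify_compound(first_fold, second_fold, tol=0.2):
    """Classify a test compound from two reporter fold changes.

    first_fold:  fold change of the reporter responsive to the first
                 (wild-type, repressing) receptor member.
    second_fold: fold change of the reporter responsive to the second
                 (mutated, non-repressing) receptor member.
    tol:         hypothetical tolerance for "substantially the same".
    """
    first_up = first_fold > 1 + tol
    second_up = second_fold > 1 + tol
    first_same = abs(first_fold - 1) <= tol
    second_same = abs(second_fold - 1) <= tol

    if first_up and second_same:
        # Higher first reporter, unchanged second reporter.
        return "relieves repression only"
    if second_up and first_same:
        # Higher second reporter, unchanged first reporter.
        return "activates only"
    if first_up and second_up and first_fold > second_fold:
        # Second reporter rises, first reporter rises by more.
        return "relieves repression and activates"
    return "indeterminate"
```

Any combination of signals that matches none of the three enumerated patterns is reported as indeterminate rather than forced into a category.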

Exemplary GAL4 response elements are those containing the
palindromic 17-mer:

5'-CGGAGGACTGTCCTCCG-3' (SEQ ID NO:3),

such as, for example, 17MX, as described by Webster et al., in Cell 52:169-178 (1988),
as well as derivatives thereof. Additional examples of suitable response elements
include those described by Hollenberg and Evans in Cell 55:899-906 (1988); or Webster
et al. in Cell 54:199-207 (1988).
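
Because the 17-mer has an odd length, its central base cannot pair with itself, so the dyad symmetry holds for the two flanking 8-base arms. The short Python sketch below (helper names are illustrative assumptions, not part of this specification) compares the sequence with its reverse complement and reports the positions that differ.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))

def mismatches_with_reverse_complement(seq):
    """Return 0-based positions where seq differs from its reverse complement."""
    rc = reverse_complement(seq)
    return [i for i, (a, b) in enumerate(zip(seq.upper(), rc)) if a != b]

GAL4_17MER = "CGGAGGACTGTCCTCCG"
print(reverse_complement(GAL4_17MER))                  # CGGAGGACAGTCCTCCG
print(mismatches_with_reverse_complement(GAL4_17MER))  # [8]
```

The only mismatch is at the central position (index 8), as expected for a palindromic site of odd length.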

In accordance with still another embodiment of the present invention,
there is provided a method to identify compounds which activate a nuclear receptor
and/or disrupt the ability of an invention SMRT co-repressor to complex with said
receptor, said method comprising:

comparing the reporter signals produced by a combination expression
system in the absence and presence of test compound,

wherein said combination expression system comprises:

a modified SMRT co-repressor as described above, 
a first homodimeric or heterodimeric member of the 
nuclear receptor superfamily selected from thyroid hormone
receptor homodimer, thyroid hormone receptor-retinoid X 
receptor heterodimer, retinoic acid receptor homodimer, or 
retinoic acid receptor-retinoid X receptor heterodimer, 

a second homodimeric or heterodimeric form of the same 
member of the nuclear receptor superfamily as employed in said 
first homodimer or heterodimer, wherein said member is mutated 
such that it retains hormone dependent activation activity but has
lost its ability to repress basal level promoter activity of target
genes, 

wherein either said first homodimer (or 
heterodimer) or said second homodimer (or heterodimer) 
is operatively linked to a GAL4 DNA binding domain, 

a response element for said member of the nuclear 
receptor superfamily, wherein said response element is 
operatively linked to a first reporter, 

a GAL4 response element, wherein said response element 
is operatively linked to a second reporter, and thereafter 

identifying as capable of disrupting the ability of a SMRT co-repressor
having a structure and function characteristic of the silencing mediator for retinoic acid
and thyroid receptors to complex with a nuclear receptor, without substantially activating
nuclear receptor, those compounds which provide:

a lower reporter signal from the reporter responsive to the first
member upon exposure of said compound to said first member, relative
to reporter signal in the absence of said compound, and

substantially the same reporter signal from the reporter
responsive to the second member upon exposure of said compound to
said second member, relative to reporter signal in the absence of said
compound, or

identifying as capable of activating nuclear receptor activity, but 
substantially lacking the ability to disrupt a complex comprising a nuclear receptor and a 
SMRT co-repressor having a structure and function characteristic of the silencing 
mediator for retinoic acid and thyroid receptors, those compounds which provide: 

a higher reporter signal from the reporter responsive to the second 
member upon exposure of said compound to said second member, 
relative to reporter signal in the absence of compound, and 

substantially the same reporter signal from the reporter 
responsive to the first member upon exposure of said compound to said 
first member, relative to reporter signal in the absence of said compound, 
or 

identifying as capable of disrupting a complex comprising a nuclear
receptor and a SMRT co-repressor having a structure and function characteristic of the
silencing mediator for retinoic acid and thyroid receptors, and activating said receptor
those compounds which provide:

a reduction in reporter signal from the reporter responsive to the 
first member upon exposure of said compound to said first member, 
relative to reporter signal in the absence of said compound, and 

increased reporter signal from the reporter responsive to the 
second member upon exposure of said compound to said second 
member, relative to reporter signal in the absence of said compound.

In accordance with a still further aspect of the present invention, there is
provided a method to identify compounds which relieve the repression of nuclear
receptor activity caused by an invention SMRT co-repressor, said method comprising
determining the effect of adding test compound to an expression system comprising:

a modified member of the nuclear receptor superfamily, wherein said 
modified member contains an activation domain which renders said receptor 
constitutively active, 

a fusion protein comprising the receptor interaction domain of SMRT 
operatively linked to the GAL4 DNA binding domain, and 

a GAL4 response element operatively linked to a reporter. 

Prior to addition of an effective ligand for the member of the nuclear 
receptor superfamily employed herein, the association of the modified member and the 
fusion protein will be effective to bind the GAL4 response element and activate
transcription of the reporter. The presence of an effective ligand is indicated by a
reduction of reporter signal upon exposure to ligand, which disrupts the interaction of the
modified member and fusion protein.

Activation domains contemplated for use in the practice of the present 
invention are well known in the art and can readily be identified by the artisan. 
Examples include the GAL4 activation domain, BP64, and the like.

To summarize, a novel family of nuclear receptor SMRT co-repressors
which mediate the transcriptional silencing of RAR and TR has been identified. This
discovery is of great interest because transcriptional silencing has been shown to play an
important role in development, cell differentiation and the oncogenic activity of v-erbA
(Baniahmad et al., EMBO J. 11:1015-1023 (1992); Gandrillon et al., Cell 49:687-697
(1989); Zenke et al., Cell 61:1035-1049 (1990); Barlow et al., EMBO J. 13:4241-4250
(1994); Levine and Manley, Cell 59:405-408 (1989); Baniahmad et al., Proc. Natl. Acad.
Sci. USA 89:10633-10637 (1992b); and Saitou et al., Nature 374:159-162 (1995)). In
fact, v-erbA mutants that harbor the Pro160->Arg change in the TR neither repress basal
transcription nor are capable of oncogenic transformation (Damm and Evans, (1993),
supra).

The function of SMRT as a silencing mediator (co-repressor) of RAR
and TR is analogous to mSin3 in the Mad-Max-Sin3 ternary complex (Schreiber-Agus et
al., Cell 80:777-786 (1995); and Ayer et al., Cell 80:767-776 (1995)). Because
GAL-SMRT functions as a potent repressor when bound to DNA, it is reasonable to
speculate that the function of the unliganded receptors is to bring SMRT with them to
the template via protein-protein interaction. Thus, the repressor function is intrinsic to
SMRT as opposed to the TR or RAR itself (Baniahmad et al., Proc. Natl. Acad. Sci. USA
90:8832-8836 (1993); and Fondell et al., Genes Dev 7:1400-1410 (1993)). It is
demonstrated herein that the ligand triggers a dissociation of SMRT from the receptor,
which would constitute an initial step in the activation process. This would be followed by (or
be coincident with) an induced conformational change in the carboxy-terminal
transactivation domain (τc, also called AF2), allowing association with co-activators
on the transcription machinery (Douarin et al., EMBO J. 14:2020-2033 (1995);
Halachmi et al., Science 264:1455-1458 (1994); Lee et al., Nature 374:91-94 (1995); and
Cavailles et al., Proc. Natl. Acad. Sci. USA 91:10009-10013 (1994)). Thus, as has
previously been suggested (Damm and Evans, (1993), supra), the ligand dependent
activation of TR would represent two separable processes including relief of repression
and net activation. The isolation of SMRT now provides a basis for dissecting the
molecular basis of trans-repression.

The invention will now be described in greater detail by reference to the 
following non-limiting examples. 

Example 1 
Isolation of SMRT 

Using a GAL4 DBD-RXR fusion protein (see, for example, USSN
08/177,740, incorporated by reference herein in its entirety) as a bait in a yeast
two-hybrid screening system (Durfee et al., (1993), supra), several cDNA clones
encoding receptor interacting proteins were isolated. One of these proteins, SMRT,
interacts strongly with unliganded RAR and TR but only weakly with RXR or other
receptors in yeast. This protein was selected for further characterization.

Example 2 
Far-western blotting procedure

Total bacterial extracts expressing GST fusions of hRARα (aa 156-462)
or hRXRα LBD (aa 228-462) and control extracts expressing GST alone or GST-PML
fusion protein were subjected to SDS/PAGE and electroblotted onto nitrocellulose in
transfer buffer (25 mM Tris, pH 8.3/192 mM glycine/0.01% SDS). After
denaturation/renaturation from 6 M to 0.187 M guanidine hydrochloride in HB buffer
(25 mM HEPES, pH 7.7/25 mM NaCl/5 mM MgCl2/1 mM DTT), filters were saturated
at 4°C in blocking buffer (5% milk, then 1% milk in HB buffer plus 0.05% NP40). In
vitro translated 35S-labeled proteins were diluted into H buffer (20 mM Hepes, pH 7.7/75
mM KCl/0.1 mM EDTA/2.5 mM MgCl2/0.05% NP40/1% milk/1 mM DTT) and the
filters were hybridized overnight at 4°C with (1 µM) or without ligand. After three
washes with H buffer, filters were dried and exposed for autoradiography or quantitated
by phosphoimager.

GST-SMRT is a GST fusion of the C-SMRT encoded by the yeast two 
hybrid clone. GST-SMRT has been purified, but contains several degradation products. 

For yeast two-hybrid screening, a construct expressing the GAL4
DBD-hRXRα LBD (aa 198-462) fusion protein was used to screen a human lymphocyte
cDNA library as described (Durfee et al., (1993), supra). Full length SMRT cDNA was
isolated from a human HeLa cDNA library (Clontech) using the two-hybrid insert as a
probe.

Using the above-described far-western blotting procedure, 35S-labeled
SMRT preferentially complexes with bacterial extracts expressing the RAR, marginally
associates with RXR and shows no association with control extracts. In contrast,
35S-PPAR selectively associates with its heterodimeric partner, RXR, but not with RAR.
In a similar assay, 35S-labeled RAR or TR interacts strongly with SMRT and their
heterodimeric partner, RXR, but not with degraded GST products, while 35S-RXR
interacts only weakly with SMRT. Binding of ligand to RAR or TR reduces their
interactions with SMRT but not with RXR, while binding of ligand to RXR has only
slight effect. Figure 1 shows the quantitation of a dose-dependent dissociation of SMRT
from RAR or TR by all-trans retinoic acid (atRA) or thyroid hormone (triiodothyronine
or T3), demonstrating that the amount of ligand required for 50% dissociation in both
cases is close to the Kd of the respective ligand (Munoz et al., EMBO J. 7:155-159 (1988); Sap
et al., Nature 340:242-244 (1989); and Yang et al., Proc. Natl. Acad. Sci. USA
88:3559-3563 (1991)).
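
One simple way to read a 50% dissociation point off a dose-response series is log-linear interpolation between the two measurements that bracket the half-maximal value. The Python sketch below is a minimal illustration under stated assumptions: the function name is hypothetical, the data points in the usage example are invented for illustration and are not taken from Figure 1, and this specification does not state how the 50% point was actually determined.

```python
import math

def half_dissociation_concentration(points, target=0.5):
    """Estimate the ligand concentration giving `target` fraction bound.

    points: list of (ligand concentration in M, fraction of SMRT still
            bound), sorted by increasing concentration.
    Returns None if no pair of adjacent points brackets the target.
    """
    for (c_lo, f_lo), (c_hi, f_hi) in zip(points, points[1:]):
        if f_lo >= target >= f_hi and f_lo != f_hi:
            # Fraction of the way from the lower to the higher dose.
            t = (f_lo - target) / (f_lo - f_hi)
            log_c = math.log10(c_lo) + t * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None

# Invented example data (not experimental values).
example = [(1e-9, 0.8), (1e-8, 0.6), (1e-7, 0.4)]
estimate = half_dissociation_concentration(example)
```

With the invented data above, the estimate falls midway between 10 nM and 100 nM on a logarithmic scale, that is, at roughly 32 nM.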

Full length SMRT encodes a polypeptide of 1495 amino acids rich in
proline and serine residues (see Figure 2 and SEQ ID NO: 1). Genbank database
comparison reveals similarity of the C-terminal domain of SMRT to a partial cDNA
encoding another receptor interacting protein, RIP13 (Seol et al., (1995), supra), whose
role in receptor signaling is unknown. Within this region, there can be identified several
potential heptad repeats which might mediate protein-protein interaction with the
"α-helical sandwich" structure (Bourguet et al., Nature 375:377-382 (1995)) of the
ligand binding domain (LBD) of receptors.

Example 3 
Characterization of SMRT 

Unlike other nuclear receptors, unliganded RAR and TR possess a strong
silencing domain which represses basal level promoter activity of their target genes
(Damm et al., Nature 339:593-597 (1989); Brent et al., New Biol. 1:329-336 (1989);
Baniahmad et al., Cell 61:505-514 (1990); and Baniahmad et al., EMBO J.
11:1015-1023 (1992)). The preferential interaction of SMRT with RAR and TR in the
absence of hormone suggests that SMRT may play a role in mediating the transcriptional
silencing effect of the receptor.

To further investigate the involvement of SMRT in silencing, the
interaction of SMRT with mutant receptors which display distinct silencing and/or
transactivation activities was tested as follows. 35S-methionine labeled receptors were
used as probes to hybridize immobilized GST-SMRT in the presence (10 µM) or
absence of all-trans retinoic acid (atRA). The total bacterial extract expressing
GST-RXR was included as a control.

When quantitated by phosphoimager, RAR403 shows a 4-fold better
interaction with SMRT than wild type RAR. Both full length RAR and a deletion mutant
expressing only the ligand binding domain (LBD, referred to as ΔΔR) associate with
SMRT; this association is blocked by ligand.

These results confirm that the LBD alone is sufficient for the interaction.
The carboxy-terminal deletion mutant RAR403 is a potent dominant negative repressor
of basal level promoter activity of RAR target genes (Damm et al., Proc. Natl. Acad. Sci.
USA 90:2989-2993 (1993); Tsai and Collins, Proc. Natl. Acad. Sci. USA 90:7153-7157
(1993); and Tsai et al., Genes Dev 6:2258-2269 (1992)). As might be predicted from the
above studies, RAR403 and its amino terminal deletion derivative, R403, interact
strongly with SMRT in either the presence or absence of ligand, consistent with SMRT
mediating the repressor activity of this mutant.

Example 4 
Interaction of SMRT with TR Mutants

The interaction of SMRT with two different classes of TR mutants was
analyzed next. The first mutant employed is the naturally occurring oncogene, v-erbA,
which has strong silencing ability but no transactivation activity (Sap et al., (1989),
supra; Sap et al., Nature 324:635-640 (1986); Weinberger et al., Nature 318:670-672
(1985); and Weinberger et al., Nature 324:641-646 (1986)). The second mutant
employed is a single amino acid change (Pro 160 -> Arg) of the rTRα (TR160) which
has previously been shown to lose its capacity for basal level repression while retaining
hormone dependent transactivation (Thompson et al., Science 237:1610-1614 (1987);
and Damm and Evans, Proc. Natl. Acad. Sci. USA 90:10668-10672 (1993)). If SMRT is
involved in silencing, it would be expected that SMRT should interact with v-erbA,
but show little or no association with the silencing-defective TR160 mutant.

Interaction of the oncogenic v-erbA and rTRα R160 mutant (TR160)
with GST-SMRT was determined in a far-western assay as described above (see
Example 2). When quantitated by phosphoimager, the v-erbA shows an 18-fold better
interaction with SMRT than hTRβ, and the TR160 mutant shows a 10-fold lower signal
than the rTRα.

As one might expect, v-erbA interacts strongly with SMRT in both the
presence and absence of ligand. In contrast, the full length TR160 mutant or the LBD of TR160
(ΔΔTR160) does not interact significantly with SMRT when compared to the wild type
receptor.

These data demonstrate that SMRT plays an important role in mediating 
transcriptional silencing effects of both RAR and TR. These data also suggest that the 
release of SMRT from receptors could be a prerequisite step in ligand-dependent 
transactivation by nuclear receptors. 

Example 5 

Formation of ternary complexes containing SMRT

RAR and TR form heterodimers with RXR, resulting in a complex with
high DNA binding ability (Bugge et al., EMBO J. 11:1409-1418 (1992); Yu et al., Cell
67:1251-1266 (1991); and Kliewer et al., Nature 355:446-449 (1992)). Since SMRT
interacts with RAR and TR, tests were conducted to determine whether SMRT can also
interact with the receptor-DNA complex. Thus, the interaction of SMRT with
RXR-RAR heterodimer on a DR5 element (i.e., an AGGTCA direct repeat spaced by
five nucleotides) was determined in a gel retardation assay, which is carried out as
follows. In vitro translated receptor or unprogrammed reticulocyte lysate (URL) was
incubated with 1 µg of poly dI-dC on ice for 15 minutes in a total volume of 20 µl
containing 75 mM KCl, 7.5% glycerol, 20 mM Hepes (pH 7.5), 2 mM DTT and 0.1%
NP-40, with or without ligand (in the range of about 10-100 nM employed). A 32P
labeled, double stranded oligonucleotide probe was added into the binding reaction
(10,000 cpm per reaction), and the reaction was further incubated for 20 minutes at room
temperature. The protein-DNA complex was separated on a 5% native polyacrylamide
gel at 150 volts.
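
The definition of a DR5 element given above (two AGGTCA half-sites in direct orientation separated by five nucleotides) can be expressed as a simple pattern search. The Python sketch below is illustrative only: the function name is hypothetical, and the probe sequence used in the usage example is invented for illustration and is not the oligonucleotide employed in this example.

```python
import re

HALF_SITE = "AGGTCA"

def find_direct_repeats(seq, spacer=5):
    """Return 0-based start positions of AGGTCA-N(spacer)-AGGTCA repeats.

    A lookahead is used so that overlapping occurrences are all reported.
    """
    pattern = re.compile(f"(?={HALF_SITE}[ACGT]{{{spacer}}}{HALF_SITE})")
    return [m.start() for m in pattern.finditer(seq.upper())]

# Invented probe: flanking bases, half-site, 5-nt spacer, half-site, flanking bases.
probe = "GATC" + "AGGTCA" + "CCAGG" + "AGGTCA" + "GATC"
print(find_direct_repeats(probe))            # [4]
print(find_direct_repeats(probe, spacer=4))  # []
```

Changing the `spacer` argument searches for other direct-repeat spacings using the same half-site.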

SMRT is seen to form a ternary complex with the RXR-RAR heterodimer on a DNA
response element in the gel retardation assay. Addition of ligand releases SMRT from
this complex in a dose-dependent manner.

Similarly, SMRT is seen to form a ternary complex with the RXR-TR
heterodimer on a TR response element; addition of T3 disrupts the formation of this
complex.

These data demonstrate that SMRT can be recruited to DNA response
elements via protein-protein interaction with RAR or TR in the absence of hormone.
Binding of hormone disrupts receptor-SMRT interaction and releases SMRT from the
receptor-DNA complex.

Example 6 
Transient transfection assay 

CV-1 cells were plated in 24 well plates at a density of 50,000 cells per
well. Expression plasmids were transfected into cells by lipofection using DOTAP. In
each transfection, 5 ng of GAL-RAR and 15 ng of v-erbA or SMRT were used together
with 150 ng of reporter construct containing 4 copies of GAL4 binding sites in front of a
minimal thymidine kinase promoter and a CMX-β-gal construct as an internal control.
The relative luciferase activity was calculated by normalizing to the β-gal activity.
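
The normalization described above is a ratio of the luciferase read-out to the β-gal read-out from the same well. The Python sketch below is a minimal illustration; the function names, the units, and the derived fold-repression helper are assumptions introduced for this example and are not specified in the text.

```python
def relative_luciferase(luciferase, beta_gal):
    """Normalize a raw luciferase reading to the internal beta-gal control."""
    if beta_gal <= 0:
        raise ValueError("beta-gal control reading must be positive")
    return luciferase / beta_gal

def fold_repression(reporter_alone, with_repressor):
    """Ratio of normalized activity without and with a repressor construct."""
    if with_repressor <= 0:
        raise ValueError("normalized activity must be positive")
    return reporter_alone / with_repressor

# Invented readings for illustration only.
basal = relative_luciferase(12000, 0.5)     # 24000.0
silenced = relative_luciferase(1500, 0.5)   # 3000.0
print(fold_repression(basal, silenced))     # 8.0
```

Normalizing each well before comparing wells corrects for differences in transfection efficiency between wells.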


Example 7 
Reversal of transcriptional silencing 

Recently, it has been shown that overexpression of RAR or TR could
reverse the transcriptional silencing effect of the GAL4 DBD fusion of TR (GAL-TR) or
RAR (GAL-RAR) (Baniahmad et al., Mol Cell Biol 15:76-86 (1995); and Casanova et
al., Mol Cell Biol 14:5756-5765 (1994)), presumably by competition for a limiting
amount of a SMRT co-repressor. A similar effect is observed herein when
overexpression of v-erbA or RAR403 mutants is shown to reverse the silencing effect of
GAL-RAR and GAL-TR on the basal activity of a luciferase reporter (see Figures 3A and
3B).

In principle, overexpression of SMRT should restore repressor activity
when co-expressed with v-erbA or RAR403 competitors. Indeed, results presented in
Figure 3C show that both the full length and the C-terminal domain of SMRT
(C-SMRT) can titrate out v-erbA or RAR403 competitor activity and re-endow
GAL-RAR and GAL-TR with silencing activity. In contrast, neither v-erbA nor SMRT
shows any effect on the transactivation activity of GAL-VP16 fusion. Thus, SMRT is
able to block the titration effect of v-erbA and RAR403 and functionally replaces the
putative SMRT co-repressor in this system.

Example 8 

Direct recruitment of SMRT to a heterologous promoter 

If SMRT is the mediator of transcription silencing of TR and RAR by
interaction with template-bound unliganded receptors, then direct recruitment of SMRT
to a heterologous promoter should result in repression of basal level activity. This was
tested by fusing full length SMRT to the GAL4 DBD (GAL-SMRT). The effect of the
resulting fusion protein on the activity of the thymidine kinase promoter containing four
GAL4 binding sites was analyzed. Figure 3D shows that GAL-SMRT, like GAL-TR,
can silence basal promoter activity in a dose-dependent manner. In contrast, GAL-RXR
shows no repression.

These data suggest that SMRT, when recruited to a promoter by direct
DNA binding or via association with an unliganded receptor, functions as a potent
transcriptional repressor.



Example 9 

Cloning Of Human And Mouse SMRT co-repressors 



This example describes the cloning of a full length human silencing
mediator of retinoic acid and thyroid hormone receptor (SMRT co-repressor) and of two
mouse SMRT isoforms, m-SMRTα and m-SMRTβ.



An examination of the previously described human SMRT co-repressor
revealed that the first eight amino acids and upstream sequences were derived from a
portion of ribonucleoprotein K sequence. Accordingly, a mouse spleen cDNA lambda
ZAP II library (Stratagene; La Jolla CA) was screened at low stringency with a probe
corresponding to approximately the 5' 1,000 base pairs (bp) of the previously identified
human SMRT (s-SMRT). A 3.5 kilobase (kb) cDNA fragment was obtained that
contained a unique sequence in addition to known s-SMRT sequence. The 5' end of this
cDNA, and subsequently obtained clones, was used in successive rounds of screening of
the mouse spleen cDNA library and a mouse brain cDNA library (Stratagene) and the
full-length SMRTα isoform cDNA (SEQ ID NO: 6) and SMRTβ isoform cDNA (SEQ
ID NO: 10) were obtained. The mouse SMRT (m-SMRT) 5' sequence then was used at
low stringency to screen a human pituitary cDNA library (Stratagene) to obtain the full-length
human SMRT (h-SMRT) cDNA (SEQ ID NO: 1). All cDNA clones were
sequenced on both strands using standard methods, and have been deposited with
GenBank as Accession No. AF103003 (h-SMRT; SEQ ID NOS: 3 and 5); Accession
No. 113001 (m-SMRTα; SEQ ID NOS: 6 and 7); and Accession No. 113002 (m-SMRTβ;
SEQ ID NOS: 8 and 9).

By sequentially shifting between the mouse spleen and mouse brain
cDNA libraries, several clones containing a potential starting methionine and 5'
untranslated region sequences were obtained. The complete polypeptide sequences of
m-SMRT (SEQ ID NO: 7) and h-SMRT (SEQ ID NO: 5) are provided. In addition, a
splice variant isolated from the mouse brain cDNA library encoded an m-SMRT
co-repressor containing a deletion of amino acids 36 to 254 of SEQ ID NO: 7 (see SEQ ID
NO: 3). The two m-SMRT co-repressors are designated SMRTα (SEQ ID NO: 7) and
SMRTβ (SEQ ID NO: 9). Based on sequence similarity to N-CoR (see below), this
deletion in m-SMRTβ removes the majority of the sequence in h-SMRT and m-SMRTα
that is homologous to N-CoR repression domain 1 (RD1), including a portion of the
Sin3A binding region.

The cloned h-SMRT (SEQ ID NO: 3) encodes a polypeptide that
contains an additional 1130 amino acids at the amino terminus as compared to the
previously described human SMRT co-repressor. The full length h-SMRT shares 84%
identity with m-SMRTα. A comparison of h-SMRT (SEQ ID NO: 5) and N-CoR (SEQ
ID NO: 11) revealed that the N-terminal extension of h-SMRT (amino acids 1 to 1030)
and N-CoR (amino acids 1 to 1031) share approximately 41% identity, which is
somewhat higher than the 36% identity shared between the full length proteins.
However, regions within the N-CoR and SMRT N-termini share striking homology
(Figures 4A and 4B).
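
Percent identity figures such as those quoted above are computed over an alignment of the two sequences. The Python sketch below shows one common convention, counting identical residues over aligned positions where neither sequence has a gap. It is an illustration under stated assumptions: the function name is hypothetical, the choice of denominator is an assumption (alignment tools differ on this point, and this specification does not state which convention was used), and the sequences in the usage example are invented.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over gap-free columns of a pairwise alignment.

    aligned_a, aligned_b: equal-length strings from an existing alignment,
    with `gap` marking insertions or deletions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared

# Invented aligned fragments for illustration only.
print(percent_identity("ACDEFGHIKL-", "ACDQFGHIRLM"))  # 80.0
```

Restricting the calculation to a sub-range of the alignment is how region-specific values, as opposed to full-length values, would be obtained under this convention.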

Amino acids 1 to 160 of N-CoR are moderately conserved in h-SMRT
(and m-SMRTα), sharing about 36% identity. This region of N-CoR has been reported
to interact with Siah2 (Zhang et al., (1998), supra) and, similarly, can be involved in an
interaction of Siah2 with h-SMRT or m-SMRTα. In particular, highly conserved
sequences in this region can be the specific Siah2 interaction sites (see Figure 4A).

A 52 amino acid segment from N-CoR (amino acids 255 to 312)
mediates an interaction with Sin3A (Heinzel et al., Nature 387:43-48 (1997)), and was
presumed to represent the core of the larger RD1 region (Horlein et al., (1995), supra).
This small interaction domain is highly conserved (about 83% identity) in h-SMRT, and
the overall identity shared between SMRT and N-CoR RD1 is about 57%.

Amino acids 312 to 668 of N-CoR also are well conserved (66%
identity) in h-SMRT (and m-SMRTα), and two internal blocks of sequences in this
region share even greater similarity (see Figure 1B; shaded regions). These blocks are
homologous to each other and to part of the SANT domain, which was identified in the
yeast chromatin remodeling factor, SWI3, the yeast adapter protein, ADA2, the basal
transcription factor TFIIIB, and other proteins (Aasland et al., Trends Biochem. Sci.
21:87-88 (1996)), suggesting that these domains share a common and important
function. The amino acids of N-CoR RD2 (see Horlein et al., (1995) supra) are the least
conserved in h-SMRT, sharing about 30% identity.

These results demonstrate that isoforms of SMRT co-repressors are
expressed in cells, as exemplified by m-SMRTα and m-SMRTβ. In addition, the
results demonstrate that the previously undescribed amino terminus of SMRT
co-repressors shares regions of substantial homology with N-CoR, and regions of homology
are identified that indicate these sequences can mediate previously uncharacterized
functions.



Example 10

Expression and Chromosomal Localization of SMRT Co-Repressors

This example describes the tissue distribution of SMRT RNA and the
chromosomal localization of human SMRT.

Total RNA was prepared from adult CB6F1 mouse tissues using
TRIZOL reagent (GIBCO/BRL), and poly(A) RNA was purified from total RNA using
an OLIGOTEX mRNA Kit (Qiagen, Valencia, CA). RNA was separated on 1.25%
agarose/6% formaldehyde gels and transferred to a NYTRAN membrane (Schleicher &
Schuell). A 720 bp m-SMRT/PstI fragment was used as a probe. Following
hybridization with the SMRT probe, the filters were stripped and hybridized with a
murine glyceraldehyde-3-phosphate dehydrogenase cDNA probe to allow normalization
for RNA loading.

Chromosomal localization of SMRT was determined by fluorescence in
situ hybridization using the 5.3 kb h-SMRT cDNA clone. The probe was labeled by
nick-translation with biotin-11-dUTP, then hybridized to normal male human metaphase
chromosomes. Chromosomes were counterstained with 4',6-diamidino-2-phenylindole
(DAPI). Chromosome identification was carried out by computer inversion of the gray
scale DAPI image on a PSI Imaging System (Perceptive Scientific Instruments; League
City TX). Chromosome 12 confirmation was carried out using a chromosome 12-specific
alpha satellite probe (Vysis; Downers Grove IL).

Previous studies using the short human SMRT co-repressor suggested
that SMRT was expressed ubiquitously in various tissues. To confirm this result,
expression of the full length m-SMRT was determined by northern blot analysis by
using a probe consisting of nucleotides 2760 to 3620 of m-SMRT (SEQ ID NO: 6). The
expression pattern was ubiquitous, as previously described, although higher levels were
detected in lung, spleen, and brain. Similarly, h-SMRT was expressed ubiquitously as
determined using a multiple tissue blot (CLONTECH; Palo Alto CA). It is noteworthy
that two isoforms of SMRT were present in the majority of the mouse tissues and likely
correspond to the m-SMRTα and m-SMRTβ isoforms.

The chromosomal location of the h-SMRT and N-CoR genes was
mapped. The h-SMRT clone hybridized to the q arm of one of the C group
chromosomes. Computer-mediated banding of the DAPI stained chromosomes
identified the labeled chromosome as chromosome 12, band q24. The chromosome 12
localization was confirmed by cohybridization of SMRT and a chromosome 12 alpha
satellite probe, D12Z3 (Vysis), which labels the pericentromeric region of chromosome
12. The location for the human N-CoR gene was determined through a mapped human
bacterial artificial chromosome clone, hCIT529I10, which is 158 kb of genomic N-CoR
and resides on chromosome 17p11.2. The SMRT and N-CoR chromosomal locations
can be accessed through GENEMAP98 from the Human Genome Project at
http://www.ncbi.nlm.nih.gov/genemap.


These results demonstrate that the full length SMRT co-repressors and 
the SMRT co-repressors are expressed in various tissues. The results also demonstrate 
that the human SMRT gene is located on chromosome 12. 



Example 11

Functional Characterization of SMRT Amino Terminus

This example demonstrates that various domains of the SMRT amino
terminus can repress nuclear receptor transcriptional activity.

Experiments were performed using the plasmids pCMX-GAL4 DBD and
pMH100-TK-luc (Nagy et al., (1997), supra). Standard PCR amplifications were used
to generate GAL4 fusion constructs. All constructs were verified by double-stranded
sequencing to confirm identity and reading frame.



Monkey CV-1 cells were grown in DMEM supplemented with 10%
resin-charcoal stripped fetal bovine serum (FBS), 50 units/ml of penicillin G, and 50
µg/ml of streptomycin sulfate at 37°C in 7% CO2. CV-1 cells (60-70% confluence,
48-well plate) were cotransfected with 16 ng of pCMX-GAL4, 100 ng of pMH100-TK-luc,
and 100 ng of pCMX-β-galactosidase in 200 µl of DMEM containing 10% super-stripped
fetal calf serum (FCS) by the N-(1-(2,3-dioleoyloxy)propyl)-N,N,N-trimethylammonium
methylsulfate (DOTAP)-mediated procedure (Nagy et al., (1997),
supra). The amount of DNA in each transfection was kept constant by addition of
pCMX. After 24 hr, the medium was replaced; cells were harvested and assayed for
luciferase activity 36 to 48 hr after transfection. Luciferase activity was normalized by
the level of β-galactosidase activity. Each transfection was performed in triplicate and
repeated at least three times.

Based on the high degree of identity between regions of the SMRT
amino terminus and the corresponding N-CoR region, the ability of regions in the SMRT
amino terminus to act in transcriptional repression was examined. A nested series of
nucleotide sequences encoding portions of the SMRT amino terminus fused to the
GAL4 DNA binding domain (GAL-DBD) was prepared in mammalian expression
vectors (Figure 5A). The constructs were cotransfected with a GAL4-TK-luciferase
reporter plasmid to determine the regulatory properties of the GAL4-SMRT fusions.
Repression was determined relative to the basal activity of the reporter in the presence of
the GAL-DBD alone.
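The normalization and fold-repression arithmetic described above can be sketched as follows. This is an illustrative calculation only: it assumes that each well's luciferase reading is divided by the β-galactosidase reading from the same well, that replicate ratios are averaged, and that fold repression is the GAL-DBD-alone value divided by the fusion value. The function names and all numeric readings are hypothetical and are not data from these experiments.

```python
from statistics import mean


def normalized_activity(luciferase, beta_gal):
    """Mean of per-well luciferase / beta-galactosidase ratios (replicate wells)."""
    if len(luciferase) != len(beta_gal):
        raise ValueError("need one beta-galactosidase reading per luciferase reading")
    return mean(luc / gal for luc, gal in zip(luciferase, beta_gal))


def fold_repression(control_activity, fusion_activity):
    """Basal activity with GAL-DBD alone divided by activity with the GAL4 fusion."""
    if fusion_activity <= 0:
        raise ValueError("fusion activity must be positive")
    return control_activity / fusion_activity


# Hypothetical triplicate readings (arbitrary units), for illustration only.
dbd_luc, dbd_gal = [8000.0, 10000.0, 6000.0], [2.0, 2.5, 1.5]
fusion_luc, fusion_gal = [1000.0, 2000.0, 500.0], [1.0, 2.0, 0.5]

control = normalized_activity(dbd_luc, dbd_gal)
fusion = normalized_activity(fusion_luc, fusion_gal)
fold = fold_repression(control, fusion)
print(f"fold repression: {fold:.1f}")
```

Dividing by the cotransfected β-galactosidase signal corrects for well-to-well differences in transfection efficiency before the fusion construct is compared with the GAL-DBD control.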

The entire SMRT amino terminus region (GAL4-SMRT(1-1031))
demonstrated the greatest amount of repression (approximately 38-fold), and virtually
extinguished reporter activity. In comparison, GAL4-SMRT (1-303), which is
equivalent to N-CoR RD1, demonstrated 6-fold repression; and GAL4-SMRT (736-1031),
which is equivalent to N-CoR RD2, demonstrated about 2.6-fold repression.
Surprisingly, the highly conserved SANT domain conferred a significant amount of
repression (about 3.3-fold).

A smaller region (amino acids 845 to 986) within the RD2 homology
region shows a higher level of sequence conservation as compared to the entire RD2
region. Deletion constructs were generated to determine whether this minimal region
was sufficient for the repression activity of RD2. Deletion of flanking amino acids 736
to 845 or of amino acids 987 to 1055 did not affect the level of repression, demonstrating
that the repressor function of RD2 is contained within a 141 amino acid core sequence of
RD2.

Based on sequence similarity to N-CoR, the deletion of amino acids 36 to
254 in the m-SMRTβ isoform removes the majority of RD1, including a portion of the
Sin3A binding region. The effect of this deletion on SMRT function was examined by
cotransfection experiments comparing repression by SMRTα to SMRTβ. These
experiments revealed that SMRTβ has substantially less repressor activity than SMRTα.
A construct containing the entire amino terminus of m-SMRTβ (amino acids 1 to 813)
repressed activity about 2.6-fold, as compared to m-SMRTα amino acids 1 to 1031,
which repressed activity about 38.1-fold. In addition, a GAL4 construct containing
m-SMRT amino acids 1 to 83 repressed activity only about 1.4-fold. These results indicate
that alternative splicing can add further diversity to expand the function of SMRT gene
products.


Example 12
Yeast Two-Hybrid Screen and Assays

To investigate whether repression by EcR in CV-1 cells is mediated by
its association with a vertebrate corepressor and whether such an interaction, if it does
occur, is impaired by the A483T mutation, a mammalian two-hybrid assay with
Gal4-c-SMRT was conducted.

A yeast two-hybrid screen (Fields and Song, Nature, 340:245-246,
(1989)) was performed by transforming approximately 2 X 10^ Y190 yeast cells with
a pAS-EcR construct and a Drosophila (0-8 hr) embryonic c-DNA two-hybrid library
(Yu et al., Nature, 385:552-555, (1997)). Transformants were selected onto
DO-Leu-Trp-His plates containing 40 mM 3-aminotriazole (Sigma) for 3-4 days. Surviving
yeast colonies were picked as primary positives and restreaked on selection plates to
isolate single clones. Activation domain plasmids were rescued from the selected
positive transformants for further analysis. Each clone was evaluated by testing its
potential interaction with several other nuclear receptors using the yeast two-hybrid
assays. E52 was isolated and further pursued based on this selection criterion.
Quantitative liquid assay of β-galactosidase was performed on positive clones 16 hr
after treating the yeast cells with no ligand, or with 3 µM ligand.

pAS-EcR is a fusion gene with the region corresponding to amino
acids 223-878 of EcRB1 fused C-terminally to the Gal4-DBD of the pAS1-CYH2
construct (Durfee et al., (1993), supra); other Gal4-DBD-based nuclear receptor
constructs used in this yeast two-hybrid assay include: USP (amino acids 50-508),
hRAR (amino acids 186-462) and hTR (amino acids 121-410) (Schulman et al.,
(1995), supra), and SMRT (Chen and Evans, (1995), supra). β-galactosidase
activities were quantified by liquid assay for yeast cells treated either without ligand
or with 3 µM of corresponding hormone. All-trans retinoic acid (ATRA) is a ligand
of RAR; 3,3',5-triiodothyroacetic acid (T3) is a ligand of TR.

Similar yeast two-hybrid assays were also used to examine the
interaction between SMRTER and mSin3A and dSin3A.

Example 13
Cloning SMRTER

To isolate full-length SMRTER cDNA, a XhoI insert fragment isolated
from the E52 clone was used to screen male and female Tudor c-DNA libraries (gift
of Tulle Hazelrigg). This initial screen resulted in isolating three overlapping c-DNA
clones covering the region of amino acid 2094 to the C terminus of SMRTER.
Additional regions were obtained from three consecutive library screens using two
cosmid clones isolated from the Tamkun genomic library (gift of John Tamkun).
Sequences of these overlapping c-DNA and genomic clones were assembled to obtain
a conceptual open reading frame of SMRTER 3446 amino acids in length (SEQ ID
NO: 12; Figure 8A). The translational initiation codon was designated based on the
sequences that match the consensus Kozak codons and is preceded by three in-frame
consecutive stop codons in the upstream region. Both strands of the sequences of the
c-DNA clones were determined using an ABI prism Big Dye® terminator cycle
sequencing ready reaction kit (PE Biosystems) and ABI 377 instrument.



Example 14
Plasmids

CMV promoter-driven expression plasmids of EcR, USP, RXR,
c-SMRT, β-galactosidase, and pMH100-TK-luc reporter, and yeast plasmids of RAR,
TR, and SMRT have been described previously (Yao et al., (1992), supra; Yao et al.,
(1993), supra; Chen and Evans, (1995), supra; Schulman et al., (1995), supra; Chen
et al., Proc. Natl. Acad. Sci. USA 93:7567-7571, (1996); Nagy et al., (1997), supra).
hsp27EcRE-TK-Luc, a reporter with six copies of the hsp27EcRE, is a gift of Barry
Forman. CMV vector-driven EcR A483T and Gal4-SMRD3 mutations were
generated using the Transformer® site-directed mutagenesis kit (Clontech) with
proper selection primers and the mutagenic primers that correspond to the missense
mutation (A483T) of EcR and to the designated mutations, M2 and M3, in the
SMRD3 domain, respectively. Other plasmids were constructed with standard
techniques, including various enzyme digestions or PCR amplification.



Example 15
Cell Culture and Transfection



CV-1 cells were grown in Dulbecco's modified Eagle's medium at
37°C in 5% CO2. The media were supplemented with 10% AG1-X8 resin charcoal
double-stripped calf bovine serum, 50 U/ml penicillin G, and 50 µg/ml streptomycin
sulfate. Approximately 20 hr after CV-1 cells (10^ cells) were plated in 48-well cell
culture clusters (Costar), cells were transiently transfected with plasmids using
DOTAP according to the instructions of the manufacturer (Boehringer Mannheim).
The amounts of CMV promoter-driven expression vectors, β-galactosidase gene
expression vector, CMX-lacZ, and reporter, pMH100-TK-luc or hsp27EcRE-TK-Luc,
were in the range of 100-200 ng, 500 ng, and 400 ng, respectively, for six wells of
each 48-well cluster in each transfection experiment. At least 4 hr after
transfection, each medium was replaced with medium either without ligand or with
1 µM of MurA. Cells were harvested and assayed approximately 48 hr after
transfection. All experiments were performed in triplicate and repeated with similar
results.

CV-1 cells were transfected with wild-type EcR or EcR A483T, along
with VP16-USP and a reporter, hsp27EcRE-TK-Luc, which contains six copies of the
hsp27EcRE fused to the thymidine kinase (TK) promoter-luciferase reporter. The
VP16-USP fusion contains the region of USP (amino acids 50-508) fused C-terminally to
the VP16 domain. Muristerone A (MurA) is a potent ecdysone agonist
(Christopherson et al., Proc. Natl. Acad. Sci. USA, 89:6314-6318, (1992)). In all
experiments, cells were also cotransfected with CMV-lacZ, which is used to
normalize the luciferase activity. As shown in Figure 6A, the ability to dimerize with
USP is reflected in reporter activity without treatment with hormone (open bar), and
the ability to respond to hormone is reflected in reporter activity when cells were
treated with 1 µM Muristerone A (closed bar).

CMV promoter-driven expression vector including wild-type EcR or
EcR A483T was cotransfected with VP16-USP and Gal4-c-SMRT (amino acids 981
to C terminus) (Chen and Evans, (1995), supra) into CV-1 cells to examine its effect
on the interaction with vertebrate corepressor. All cells were also cotransfected with a
TK-luciferase reporter construct, pMH100-TK-Luc, containing four copies of the
yeast Gal4-responsive element. EcR A483T corresponds to a single amino acid
change (alanine→threonine) at the 483 site of EcR (Bender et al., (1997), supra). The
results of this experiment (Figure 6B) show that EcR A483T disrupts the interaction
with SMRT.



Example 16
In Vitro Interaction Assays

Glutathione S-transferase fusion proteins, including GST-X, GST-ERID1
(amino acids 1698-2063 of SEQ ID NO:1), and GST-ERID2 (amino acids
2951-3038 of SEQ ID NO:1), were expressed in E. coli DH5 cells, and extracts were
affinity purified by binding to glutathione Sepharose 4B beads. Bound proteins used
as affinity matrices in pull-down experiments were first equilibrated with the binding
buffer (20 mM HEPES [pH 7.9], 150 mM NaCl, 1 mM EDTA, 4 mM MgCl2, 1 mM
DTT, 0.06% NP40, 10% glycerol, 0.25 mM PMSF, 1 mg BSA). For pull-down
assays using GST-ERID1 (amino acids 1698-2063 of SEQ ID NO:1) and GST-ERID2
(amino acids 2951-3038 of SEQ ID NO:1), additional hsp27EcRE (0.05 µg/ml) was
added to the binding buffer. In this experiment, 30 µl of 50% GST-protein bead
slurry, containing approximately 1 µg of proteins, was incubated with 10 µl of
35S-methionine-labeled proteins in 300 µl of the binding buffer (with or without 3 µM of
MurA as indicated) for 30 min at room temperature. After the incubation, beads were
washed three times with the binding buffer (with or without ligand) and resuspended
in SDS-PAGE sample buffer before loading. After electrophoresis, bound radiolabeled
proteins were visualized by autoradiography. 35S-methionine-labeled EcR and
USP were generated in a coupled transcription-translation system, TNT (Promega),
using CMX-EcR (T7) and CMX-uspK (T7) constructs as templates, respectively.
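As an illustration of how the binding buffer composition given above translates into pipetting volumes, the following sketch applies the dilution relation C1 x V1 = C2 x V2. The final concentrations are those listed in the text; the stock concentrations and the 50 ml batch size are assumptions chosen for the example and are not specified in this document. BSA is omitted because the text gives only a mass ("1 mg") without a volume basis.

```python
def stock_volume_ml(final_conc, stock_conc, final_volume_ml):
    """Volume of stock needed, from C1*V1 = C2*V2 (both concentrations in the same unit)."""
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be positive and at least as concentrated as the target")
    return final_conc * final_volume_ml / stock_conc


# (final concentration from the text, hypothetical stock concentration)
recipe = {
    "HEPES pH 7.9 (mM)": (20, 1000),
    "NaCl (mM)": (150, 5000),
    "EDTA (mM)": (1, 500),
    "MgCl2 (mM)": (4, 1000),
    "DTT (mM)": (1, 1000),
    "NP40 (%)": (0.06, 10),
    "glycerol (%)": (10, 50),
    "PMSF (mM)": (0.25, 100),
}

final_volume_ml = 50.0  # assumed batch size
volumes = {
    name: stock_volume_ml(final, stock, final_volume_ml)
    for name, (final, stock) in recipe.items()
}
water_ml = final_volume_ml - sum(volumes.values())

for name, vol in volumes.items():
    print(f"{name:20s} {vol:7.3f} ml")
print(f"{'water':20s} {water_ml:7.3f} ml")
```

Labile components such as DTT and PMSF are commonly added fresh just before use; that practice is general laboratory convention and is not stated in the text.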

Example 17

Immunohistochemistry and Immunofluorescence

Antibodies against SMRTER were raised in rabbits immunized with
bacterially expressed glutathione S-transferase fusion proteins corresponding to the
region (amino acids 2477-2648 of SEQ ID NO:1) of SMRTER. Specific antibodies
were purified by affinity chromatography through antigen-linked columns and used at
1:200 dilution for tissue staining. Tissues for whole-mount staining were dissected at
the wandering third instar stage of the Canton-S strain larvae and fixed (4%
formaldehyde in 1× PBS, 50 mM EGTA) for at least 30 min. Preincubation,
secondary antibodies, washes, and peroxidase reactions are described in the protocol
of the Elite ABC (Rabbit IgG) kit (Vector). For the pilot experiments, partially
purified IgG from preimmunization serum was used. For polytene chromosome
staining, salivary glands were dissected according to the method described in Zink et
al., EMBO J., 10:153-162, (1991).

Chromosome spreads were costained with affinity-purified anti-SMRTER
(1:100) polyclonal antibody and with anti-USP monoclonal antibody
(ABIII/AD5; gift of F. Kafatos, 1:100 dilution). SMRTER was detected with Texas
red-conjugated donkey anti-rabbit secondary antibody (1:100 dilution), and USP was
detected with FITC-conjugated donkey anti-mouse secondary antibody (1:100
dilution) (Jackson ImmunoResearch Labs).

Example 18

EcR Interacts Genetically with dSin3A

In keeping with the evidence that dSin3A is a component in the EcR
regulatory pathway, an experiment was conducted to examine whether dSin3A
interacts genetically with EcR using several previously characterized Drosophila EcR
and dSin3A mutants (Bender et al., (1997), supra; Neufeld et al., (1998), supra). In
the experiment, in which female dSin3AK07401 were crossed with male EcRE261st
using techniques known in the art (see Table 1 below), only approximately 14% of the
scored EcRE261st/dSin3AK07401 progeny survived, a percentage that is significantly
lower than the expected 33.3%. This suggests that a large portion of the
EcRE261st/dSin3AK07401 flies either die prior to eclosion or fail to eclose.
Additionally, surviving EcRE261st/dSin3AK07401 escapers showed delayed
development and wing defects, in which wings are held horizontally at a 45°-90° angle
from the body axis. These results suggest that dSin3A shares an overlapping
regulatory pathway with EcR.

In a reverse genetic cross, in which female EcRE261st were crossed
with male dSin3AK07401, none of the EcRE261st/dSin3AK07401 flies survived to
adulthood. These results suggest that EcRE261st/dSin3AK07401 results in a
genetically sensitized background. When the maternally deposited EcR in embryos
descended from female EcRE261st/SM6b was cut in half, the lethality for
EcRE261st/dSin3AK07401 was further increased. These results reveal that, in
addition to its previously known zygotic function, EcR also contributes maternally to
Drosophila development.
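The expected 33.3% figure follows from the balancer cross itself: a dSin3AK07401/CyO by EcRE261st/SM6b mating yields four equally likely genotypes, one of which (CyO/SM6b) is lethal according to the note to Table 1, so the trans-heterozygote should make up one of three viable classes. The sketch below enumerates this and, as an added illustration, compares the observed rate with the expectation using an exact one-sided binomial test. The choice of test is an assumption made here, not a method stated in the text, and the count of 20 flies is reconstructed from the rounded values 14% and n = 141.

```python
from fractions import Fraction
from itertools import product
from math import comb


def expected_surviving_fraction(mother, father, target, lethal):
    """Fraction of viable progeny with the target genotype, assuming equal segregation."""
    viable = [
        frozenset(pair)
        for pair in product(mother, father)
        if frozenset(pair) not in lethal
    ]
    hits = sum(1 for genotype in viable if genotype == target)
    return Fraction(hits, len(viable))


def binomial_lower_tail(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p), computed with rational arithmetic."""
    q = 1 - p
    return sum(comb(n, i) * p**i * q ** (n - i) for i in range(k + 1))


mother = ("dSin3AK07401", "CyO")
father = ("EcRE261st", "SM6b")
target = frozenset({"dSin3AK07401", "EcRE261st"})
lethal = {frozenset({"CyO", "SM6b"})}  # per the note to Table 1

expected = expected_surviving_fraction(mother, father, target, lethal)
n_scored = 141
observed = round(0.14 * n_scored)  # reconstructed from the rounded percentage
p_value = float(binomial_lower_tail(observed, n_scored, expected))

print(f"expected fraction: {expected} ({float(expected):.1%})")
print(f"observed: {observed} of {n_scored}; one-sided p = {p_value:.2e}")
```

The same enumeration applies to the reciprocal cross, since the expected genotype classes do not depend on which parent carries which balancer; only the maternal contribution differs.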

Table 1. EcR Interacts Genetically with dSin3A

Cross                                                    Surviving Rate (%)

dSin3AK07401/CyO (female) x EcRE261st/SM6b (male)        14 (n = 141)

EcRE261st/SM6b (female) x dSin3AK07401/CyO (male)        0 (n = 144)



A similar wing held-out phenotype is also observed in EcRE261st/dSin3Axe574,
Df(2R)nap11/dSin3AK07401, and Df(2R)nap11/dSin3Axe574. EcRE261st and
Df(2R)nap11 are both described in Figure 6. dSin3AK07401 is an allele with a P
element insertion within the 5' intron of Sin3A. dSin3Axe574 is an X ray-generated
allele (Neufeld et al., (1998)). n = the number of surviving flies scored. Note that
CyO/SM6b is lethal.



EcRA483T flies showed developmental abnormalities in wings and tergites.
A similar phenotype, although with a lower penetrance, has also been observed
in EcRA483T/Df(2R)20B and in EcRA483T/Df(2R)nap11. Df(2R)20B and
Df(2R)nap11 are both deficiencies in which EcR is deleted (Bender et al., (1997),
supra). Sequence alignment of EcR with the vertebrate TR, RAR, and v-erbA, an
oncogenic TR variant, revealed that alanine 483 is located within a highly conserved
23-amino acid (aa) loop region connecting helices 3 and 4, termed the LBD signature
motif (Wurtz et al., Nat. Struct. Biol., 3:206, (1996)) (see Figure 6C). Based on
structural studies of vertebrate nuclear receptors (for review, see Moras and
Gronemeyer, (1998), supra), this alanine residue appears to be on the exposed
surface, consistent with it being a potential corepressor binding site for nuclear
receptors.

These in vivo studies indicate that EcRA483T is a semilethal allele
(Bender et al., (1997), supra). When EcRA483T is in trans with EcRE261st, an allele
that removes both the DBD and LBD domains of EcR, animals are primarily lethal
(>95%). The few surviving EcRA483T/EcRE261st flies, however, display significant
delays in development, blistered wings, and defective tergites, indicating that EcR is
involved in the development of these tissues. The ability of EcR to bind a vertebrate
corepressor and the loss of this property in EcR A483T suggest that the defects
observed in EcRA483T flies may result from the disruption of its interaction with an
as yet unidentified Drosophila corepressor.

Example 19
Isolation of an EcR-Interacting Factor

The CMV promoter-driven expression vector including wild-type EcR
or EcR A483T was cotransfected with VP16-USP and Gal4-c-SMRT (amino acids
981 to C terminus) (Chen and Evans, (1995), supra) into CV-1 cells to examine its
effect on the interaction of the invertebrate SMRTER with vertebrate corepressor. All
cells were also cotransfected with a TK-luciferase reporter construct, pMH100-TK-Luc,
containing four copies of the yeast Gal4-responsive element. EcR A483T
corresponds to a single amino acid change (alanine→threonine) at the 483 site of EcR
(Bender et al., (1997), supra). Although EcR readily interacted with SMRT in both
mammalian and yeast cells (Figure 6B; Figure 7), repeated low-stringency
hybridization screens failed to identify a Drosophila homolog of SMRT. No
SMRT/N-CoR homolog was found in C. elegans.

Example 20
Isolation and Characterization of an
EcR-Interacting Clone - Yeast Two-Hybrid Screen

To pursue the isolation of an EcR corepressor, a yeast two-hybrid
interaction screen was performed of a Drosophila embryonic cDNA library using
pAS-EcR as bait. E52 was isolated as one of the complementary positive clones from
a yeast two-hybrid screen with pAS-EcR as bait, as described in Example 12.

Example 21

Characterization of a Repression-Defective EcR Allele, EcRA483T

(A) CV-1 cells were transfected with wild-type EcR or EcR A483T, along
with VP16-USP and a reporter, hsp27EcRE-TK-Luc, which contains six copies of the
hsp27EcRE fused to the thymidine kinase (TK) promoter-luciferase reporter. In all
experiments, cells were also cotransfected with CMV-lacZ, which is used to
normalize the luciferase activity. The ability to dimerize with USP was reflected in
reporter activity without treatment with hormone (open bar), and the ability to
respond to hormone was reflected in reporter activity when cells were treated with 1
µM Muristerone A (closed bar). The VP16-USP fusion contains the region of USP (amino
acids 50-508) fused C-terminally to the VP16 domain. Muristerone A (MurA) is a
potent ecdysone agonist (Christopherson et al., (1992), supra). In these tests EcR
A483T was selectively defective in repression.

(B) CMV promoter-driven expression vector including wild-type EcR
or EcR A483T was cotransfected with VP16-USP and Gal4-c-SMRT (amino acids 981
to C terminus) (Chen and Evans, (1995), supra) into CV-1 cells to examine its effect
on the interaction with vertebrate corepressor. All cells were also cotransfected with a
TK-luciferase reporter construct, pMH100-TK-Luc, containing four copies of the
yeast Gal4-responsive element. EcR A483T corresponds to a single amino acid
change (alanine→threonine) at the 483 site of EcR (Bender et al., (1997), supra). The
results of this test show that EcR A483T disrupts the interaction with SMRT.

(C) Sequence alignment of EcR with the vertebrate TR, RAR, and
v-erbA, an oncogenic TR variant, reveals that the alanine 483 of the EcRA483T
mutant is located within a highly conserved 23-amino acid (aa) loop region
connecting helices 3 and 4, termed the LBD signature motif (Wurtz et al., (1996),
supra) (Figure 6C). Based on structural studies of vertebrate nuclear receptors (for
review, see Moras and Gronemeyer, (1998), supra), this alanine residue appears to be
on the exposed surface, consistent with it being a potential corepressor binding site for
nuclear receptors.

In vivo studies indicated that EcRA483T is a semilethal allele (Bender
et al., (1997), supra). When EcRA483T is in trans with EcRE261st, an allele that
removes both the DBD and LBD domains of EcR, animals are primarily lethal
(>95%). The few surviving EcRA483T/EcRE261st flies, however, display significant
delays in development, blistered wings, and defective tergites, indicating that EcR is
involved in the development of these tissues. The ability of EcR to bind a vertebrate
corepressor and the loss of this property in EcR A483T suggested to us that the
defects observed in EcRA483T flies may result from the disruption of its interaction
with an as yet unidentified Drosophila corepressor.

Example 22
Isolation of an EcR-Interacting Factor

Although EcR readily interacts with SMRT in both mammalian and
yeast cells (Figure 6B; Figure 7), repeated low-stringency hybridization screens failed
to identify a Drosophila homolog of SMRT. Given that no SMRT/N-CoR homolog is
found in C. elegans, it was believed that either a SMRT/N-CoR-like corepressor is not
conserved in invertebrates or, alternatively, invertebrate corepressors may diverge
significantly from their vertebrate counterparts. To pursue the isolation of an EcR
corepressor, a yeast interaction screen of a Drosophila embryonic cDNA library using
EcR as bait was conducted as described in Example 19. This screen resulted in the
isolation of a clone, E52, whose protein product interacts with EcR as well as with the
vertebrate RAR and TR, but notably not with USP (Figure 7). Unlike the interaction
between E52 and RAR, which can be dissociated by all-trans retinoic acid, the
interaction between E52 and EcR, or the interaction between SMRT and EcR, is not
dissociated by Muristerone A (MurA). This result suggests that other factors essential
for the dissociation of E52 from EcR, such as USP, are missing in yeast (see below).

Example 23

Isolation and Characterization of an EcR-Interacting Clone

E52 was isolated as one of the complementary positive clones from a
yeast two-hybrid screen. Isolation of overlapping cDNA and genomic clones led to
the identification of a full-length sequence encoding a large protein of 3446 amino
acids (Figure 8A). This protein contains several unusually long stretches of Gln, Ala,
Gly, and Ser repeats. Comparative analysis reveals it to be a novel protein with
limited regions of clear homology with the vertebrate nuclear receptor corepressors
SMRT and N-CoR (Chen and Evans, (1995), supra; Horlein et al., (1995), supra;
Ordentlich et al., (1999), supra; Park et al., (1999), supra). The gene for this protein,
SMRTER (SMRT-related ecdysone receptor-interacting factor), was shown by Northern blot
analysis to encode large transcripts (>12 kb) expressed broadly throughout the
embryonic stage and three larval stages, as well as in adult Drosophila flies.

Example 24

Molecular and Biochemical Analysis for ERID1 and ERID2

Interaction with the EcR complex was evaluated based on transient
transfection with the Gal4-SMRTER fusion genes, USP, EcR-VP16 (the VP16
transactivating domain was fused C-terminally to the end of the EcRB1 isoform), and
the reporter, pMH100-TK-Luc.

In vitro pull down assays (Example 12) were conducted to determine
whether EcR interacts with ERID1 and ERID2. In vitro translated 35S-methionine-labeled
EcRB1 alone, or a mixture of 35S-methionine-labeled EcRB1 and unlabeled
USP, or 35S-methionine-labeled USP alone, were incubated with GST, GST-ERID1
(amino acids 1698-2063 of SEQ ID NO:1), or GST-ERID2 (amino acids of SEQ ID
NO:1). GST-ERID1 and GST-ERID2, but not GST alone, pull down labeled EcR,
whereas little interaction is found between USP and any of the three GST proteins. In
addition, the pull-down complex was disrupted by the addition of 3 µM MurA when
USP is present. These in vitro results establish that SMRTER and EcR may interact
directly.

Further in vitro tests were conducted to determine whether ERID1, ERID2, and
c-SMRT compete with each other to bind EcR. Gal4-ERID1 (amino acids 1698-2063
of SEQ ID NO:1) or Gal4-ERID2 (amino acids 2929-3181 of SEQ ID NO:1), along
with EcR-VP16 and USP, were transfected in CV-1 cells as described above. In this
competition experiment, additional ERID1, ERID2, and c-SMRT (Chen et al., (1996),
supra) were cotransfected into cells. ERID1 (1698-2063) and ERID2 (amino acids
2929-3038 of SEQ ID NO:1) were tagged with the nuclear targeting signal
(MAPKKKRKV) (SEQ ID NO:3) to ensure that these proteins were localized in
nuclei. As shown in Figure 11C, interaction between each Gal4-ERID fusion and
EcR-VP16:USP was significantly decreased by both ERIDs and by c-SMRT.
Interestingly, a more prominent effect was observed in experiments when
Gal4-ERID1 (amino acids 1698-2063 of SEQ ID NO:1) was challenged by ERID2, and,
conversely, a more efficient competition was achieved by ERID1 to Gal4-ERID2
(amino acids 2094-3181 of SEQ ID NO:1). Together, these results suggest that
ERID1, ERID2, and c-SMRT may bind similar or overlapping surface(s) in EcR.

Example 25

SMRTER Colocalizes with the EcR on Polytene Chromosomes



SMRTER antibodies were prepared as described in Example 12 to
examine the cytological and chromosomal localization patterns of SMRTER.
Consistent with its action as a corepressor of EcR, SMRTER was localized to nuclei
of salivary glands and of fat bodies, as well as to nuclei of eye, wing, and leg imaginal
discs isolated from the third instar larvae.

Next, association of SMRTER with the EcR:USP complex on
chromosomes was examined. The USP staining pattern was used as an index for
EcR's presence on chromosomes. Since USP and EcR colocalized with each other on
polytene chromosomes (Yao et al., (1993), supra), chromosomal spreads prepared
from the salivary glands of wandering third instar larvae (prior to pupariation) were
subjected to simultaneous immunological staining with antibodies against SMRTER
and USP. SMRTER was detected with antibody conjugated with Texas red, USP
with FITC.



To visualize the band, interband, and puffing patterns of the polytene
chromosomes, the chromosomes were counterstained with DAPI to show the banding
regions while leaving the interbands and puffs unstained or lightly stained. Indirect
immunofluorescence staining revealed that SMRTER is a chromosome-bound protein
and colocalizes with USP (FITC) at a majority of chromosomal sites; whereas in a
pilot experiment, no such staining patterns were detected using the preimmunization
serum. The strongest SMRTER staining was primarily associated with the boundary
between band and interband regions as well as within the interband regions of
chromosomes counterstained with DAPI. This result confirms that, as an EcR-associating
factor, SMRTER is recruited by the EcR:USP heterodimers to their
specific target chromosomal loci.

SMRTER staining can still be detected in puffed regions, such as the
2B puff. Since the polytene chromosomes consist of a parallel arrangement of several
hundred to two thousand copies of the euchromatic portions of the chromosomes, an
individual binding protein like SMRTER may be cycling on and off, resulting in a
steady state of signals detected in the broader chromatin regions. Whether or not
SMRTER levels actually change prior to or after the peak of ecdysone pulses remains
to be established.

While the invention has been described in detail with reference to certain 
preferred embodiments thereof, it will be understood that modifications and variations 
are within the spirit and scope of that which is described and claimed. 






That which is claimed is: 

1. An isolated polynucleotide encoding a member of a family of 
silencing mediators of retinoic acid receptor and thyroid hormone receptor, or an 
isoform or peptide portion thereof (SMRT co-repressor), or an isolated polynucleotide 
complementary thereto. 

2. The polynucleotide of claim 1, which modulates transcriptional 
potential of a member of the nuclear receptor superfamily (nuclear receptor). 

3. The polynucleotide of claim 2, wherein the SMRT co-repressor 
comprises a repression domain having 

a) less than about 83% identity with a Sin3A interaction 
domain of N-CoR set forth as amino acids 255 to 312 of SEQ ID NO: 11; 

b) less than about 57% identity with repression domain 1 of 
N-CoR set forth as amino acids 1 to 312 of SEQ ID NO: 11; 

c) less than about 66% identity with a SANT domain of 
N-CoR set forth as amino acids 312 to 668 of SEQ ID NO: 11; or 

d) less than about 30% identity with repression domain 2 of 
N-CoR set forth as amino acids 736 to 1031 of SEQ ID NO: 11, 

and polynucleotides that hybridize thereto under stringent 
conditions. 

4. The polynucleotide of claim 1, wherein the SMRT co-repressor is a 
human SMRT co-repressor having an amino acid sequence as set forth in SEQ ID 
NO: 5 or conservative variations thereof. 

5. A polynucleotide which hybridizes under stringent conditions with 
a polynucleotide according to claim 2. 

6. A polynucleotide that has at least 80% sequence identity with a 
polynucleotide according to claim 2. 

7. The polynucleotide of claim 4, which has a nucleotide sequence as 
set forth in SEQ ID NO: 4, and conservative variations thereof. 

8. The polynucleotide of claim 1, wherein the SMRT co-repressor is a 
mouse SMRTα isoform. 

9. The polynucleotide of claim 6, having an amino acid sequence as 
set forth in SEQ ID NO: 7 or conservative variations thereof. 

10. The polynucleotide of claim 4, which has a nucleotide sequence as 
set forth in SEQ ID NO: 6. 

11. The polynucleotide of claim 1, wherein the SMRT co-repressor is 
a mouse SMRTβ isoform. 

12. The polynucleotide of claim 11, having an amino acid sequence as 
set forth in SEQ ID NO: 9 or conservative variations thereof. 



13. The polynucleotide of claim 11, which has a nucleotide sequence 
as set forth in SEQ ID NO: 8. 

14. The polynucleotide of claim 1, comprising a nucleotide sequence 
selected from the group consisting of: 

nucleotides 1 to 3094 of SEQ ID NO: 4; 
nucleotides 1 to 3718 of SEQ ID NO: 6; and 
nucleotides 1 to 2801 of SEQ ID NO: 8. 

15. A polynucleotide that hybridizes under stringent conditions with a 
polynucleotide according to claim 14, provided that the polynucleotide does not 
contain a sequence identical to SEQ ID NO: 11. 

16. A polynucleotide that has at least 80% sequence identity with a 
polynucleotide according to claim 14, provided that the polynucleotide does not 
contain a sequence identical to SEQ ID NO: 11. 

17. A polynucleotide of claim 1, comprising a nucleotide sequence 
selected from the group consisting of: 

nucleotides 1 to 8388 of SEQ ID NO: 6; and 
nucleotides 1 to 7465 of SEQ ID NO: 8. 

18. The polynucleotide of claim 1, comprising nucleotides 1 to 8561 
of SEQ ID NO: 4. 

19. The polynucleotide of claim 1, which is operably linked to a 
second nucleotide sequence. 

20. The polynucleotide of claim 19, which encodes a fusion 
polypeptide comprising the SMRT co-repressor operably linked to a DNA binding 
domain of a transcription factor. 

21. A vector comprising the polynucleotide of claim 1. 

22. A host cell containing the polynucleotide of claim 1. 

23. An isolated oligonucleotide, comprising at least 15 nucleotides 
that can hybridize specifically to the polynucleotide of claim 1, but not to a 
polynucleotide encoding SEQ ID NO: 11 or to a polynucleotide encoding an amino 
acid sequence consisting of amino acids 1031 to 2517 of SEQ ID NO: 5. 

24. The oligonucleotide of claim 23, wherein the polynucleotide 
encodes at least five contiguous amino acids of a sequence selected from the group 
consisting of: 

amino acids 720 to 745 of SEQ ID NO: 5; 
amino acids 716 to 742 of SEQ ID NO: 7; and 
amino acids 497 to 523 of SEQ ID NO: 9. 



25. The oligonucleotide of claim 23, which can hybridize specifically 
to a polynucleotide encoding SEQ ID NO: 5 or SEQ ID NO: 7, but not to a 
polynucleotide encoding SEQ ID NO: 9. 

26. An isolated silencing mediator of retinoic acid and thyroid 
hormone receptor, or isoform or peptide portion thereof (SMRT co-repressor), 
wherein the co-repressor modulates transcriptional potential of a member of the 
nuclear receptor superfamily (nuclear receptor). 

27. An isolated co-repressor comprising a repression domain having 

a) less than about 83% identity with a Sin3A interaction 
domain of N-CoR set forth as amino acids 255 to 312 of SEQ ID NO: 11; 

b) less than about 57% identity with repression domain 1 of 
N-CoR set forth as amino acids 1 to 312 of SEQ ID NO: 11; 

c) less than about 66% identity with a SANT domain of 
N-CoR set forth as amino acids 312 to 668 of SEQ ID NO: 11; or 

d) less than about 30% identity with repression domain 2 of 
N-CoR set forth as amino acids 736 to 1031 of SEQ ID NO: 11. 

28. An isolated peptide, comprising at least six contiguous amino 
acids of an amino acid sequence selected from the group consisting of: 

amino acids 1 to 1030 of SEQ ID NO: 5; 
amino acids 1 to 1029 of SEQ ID NO: 7; 
amino acids 1 to 809 of SEQ ID NO: 9; 

and conservative variations thereof, 
provided the peptide is not identical to a sequence of SEQ ID NO: 11. 

29. An isolated antibody that binds specifically to the peptide of claim 
28. 

30. A cell line, which produces the antibody of claim 29. 

31. A chimeric molecule, comprising the SMRT co-repressor of claim 
26 and at least a second molecule. 

32. A complex, comprising a SMRT co-repressor of claim 26 and a 
member of the nuclear receptor superfamily (nuclear receptor). 

33. The complex of claim 32, wherein the nuclear receptor is in the 
form of a dimer. 

34. A method for identifying an agent that modulates the repressor 
potential of a SMRT co-repressor, the method comprising: 

a) contacting a host cell with an agent, 

wherein the host cell contains a first expressible nucleotide 
sequence operably linked to a first DNA regulatory element, and 

expresses a fusion polypeptide comprising a SMRT co-repressor 
of claim 26, and a DNA binding domain of a first transcription 
factor, which can specifically bind the first DNA regulatory element, 

and wherein binding of the DNA binding domain of the first 
transcription factor to the first DNA regulatory element results in expression 
of the first expressible nucleotide sequence; and 

b) detecting a change in the level of expression of the first 
expressible nucleotide sequence due to contacting the host cell with the agent, 
thereby identifying an agent that modulates the repressor potential of a SMRT 
co-repressor. 

35. A method for identifying an agent that modulates a function of a 
SMRT co-repressor, the method comprising: 

a) contacting a SMRT co-repressor of claim 26, 

a member of the nuclear receptor superfamily (nuclear 
receptor), and 

an agent; and 

b) detecting an altered activity of the SMRT co-repressor in 
the presence of the agent as compared to the absence of the agent, thereby 
identifying an agent that modulates a function of the SMRT co-repressor. 

36. A method of modulating the transcriptional potential of a member 
of the nuclear receptor superfamily (nuclear receptor) in a cell, the method comprising 
introducing a polynucleotide of claim 1 into the cell, whereby the polynucleotide or 
an expression product of the polynucleotide alters the level of a SMRT co-repressor in 
the cell, thereby modulating the transcriptional potential of the nuclear receptor. 

37. A method of identifying a molecule that interacts specifically with 
a SMRT co-repressor, the method comprising: 

a) contacting the molecule with the SMRT co-repressor of 
claim 26; and 

b) detecting specific binding of the molecule to the SMRT 
co-repressor, thereby identifying a molecule that interacts specifically with a 
SMRT co-repressor. 
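
Several of the claims above recite sequence identity thresholds (for example, "less than about 83% identity" or "at least 80% sequence identity"). The claims do not state how identity is to be computed, so the following Python sketch is only an illustration under stated assumptions: the two sequences are already aligned to equal length, `-` marks a gap, and identity is counted over positions where neither sequence has a gap. The function name `percent_identity` and the example sequences are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption (not from the source): positions where either sequence
    carries a gap are excluded from both numerator and denominator.
    Other conventions (e.g. dividing by the full alignment length)
    would give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # positions where both sequences have a residue
    matches = 0   # positions where those residues are the same
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a.upper() == b.upper():
            matches += 1

    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * matches / compared


# Hypothetical 8-residue fragments differing at two positions.
identity = percent_identity("MKTAYIAK", "MKSAYLAK")
print(f"{identity:.1f}% identity")
```

With a function of this kind, a threshold such as "less than about 83%" reduces to a simple numeric comparison against the returned value; the alignment step itself, which this sketch assumes has already been done, is where most of the practical choices lie.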



ABSTRACT OF THE INVENTION 

The present invention relates to isolated polynucleotides encoding a 
family of silencing mediators of retinoic acid and thyroid hormone receptor (SMRT) 
isoforms, including vertebrate and invertebrate isoforms thereof, for example, a full 
length human SMRT co-repressor, two isoforms of a mouse SMRT (a longer form, 
mouse SMRTα, and a shorter form, mouse SMRTβ), and an isoform of an insect 
(Drosophila), SMRTER, as well as peptide portions of the SMRT co-repressors that 
can modulate transcriptional potential of a member of the nuclear receptor 
superfamily (nuclear receptor); to oligonucleotides that can hybridize specifically to 
such a polynucleotide; and to vectors and to host cells containing such polynucleotides. 
The invention also relates to polypeptide SMRT co-repressors encoded by such 
invention SMRT polynucleotides, and to peptide portions thereof that can modulate 
transcriptional potential of a nuclear receptor, including peptide portions of a SMRT 
co-repressor that are not present in an N-CoR polypeptide. In addition, the invention 
relates to chimeric molecules and to complexes containing a SMRT co-repressor or 
peptide portion thereof, to antibodies that specifically bind such compositions, and to 
methods for identifying an agent that modulates the repressor potential of a SMRT 
co-repressor. The invention also provides methods for identifying an agent that 
modulates a function of a SMRT co-repressor; for modulating the transcriptional 
potential of a nuclear receptor in a cell using the compositions of the invention; and 
for identifying a molecule that interacts specifically with a SMRT co-repressor. 

SEQUENCE LISTING 

<110> Evans, Ronald M. 
Chen, J. Don 

<120> A FAMILY OF TRANSCRIPTIONAL CO-REPRESSORS THAT INTERACT WITH 
NUCLEAR HORMONE RECEPTORS AND USES THEREFOR 

<130> SALK1510-3 

<150> 09/337,384 
<151> 1999-06-21 

<150> 08/522,726 
<151> 1995-09-01 

<160> 11 

<170> FastSEQ for Windows Version 4.0 

<210> 1 
<211> 1495 
<212> PRT 

<213> Homo sapiens 



<400> 1 



Met Glu Ala Trp Asp Ala His Pro Asp Lys Glu Ala Phe Ala Ala Glu 
1 5 10 15 

Ala Gln Lys Leu Pro Gly Asp Pro Pro Cys Trp Thr Ser Gly Leu Pro 
20 25 30 

Phe Pro Val Pro Pro Arg Glu Val Ile Lys Ala Ser Pro His Ala Pro 
35 40 45 

Asp Pro Ser Ala Phe Ser Tyr Ala Pro Pro Gly His Pro Leu Pro Leu 
50 55 60 

Gly Leu His Asp Thr Ala Arg Pro Val Leu Pro Arg Pro Pro Thr Ile 
65 70 75 80 

Ser Asn Pro Pro Pro Leu Ile Ser Ser Ala Lys His Pro Ser Val Leu 
85 90 95 

Glu Arg Gln Ile Gly Ala Ile Ser Gln Gly Met Ser Val Gln Leu His 
100 105 110 

Val Pro Tyr Ser Glu His Ala Lys Ala Pro Val Gly Pro Val Thr Met 
115 120 125 

Gly Leu Pro Leu Pro Met Asp Pro Lys Lys Leu Ala Pro Phe Ser Gly 
130 135 140 

Val Lys Gln Glu Gln Leu Ser Pro Arg Gly Gln Ala Gly Pro Pro Glu 
145 150 155 160 

Ser Leu Gly Val Pro Thr Ala Gln Glu Ala Ser Val Leu Arg Gly Thr 
165 170 175 

Ala Leu Gly Ser Val Pro Gly Gly Ser Ile Thr Lys Gly Ile Pro Ser 
180 185 190 

Thr Arg Val Pro Ser Asp Ser Ala Ile Thr Tyr Arg Gly Ser Ile Thr 
195 200 205 

His Gly Thr Pro Ala Asp Val Leu Tyr Lys Gly Thr Ile Thr Arg Ile 
210 215 220 

Ile Gly Glu Asp Ser Pro Ser Arg Leu Asp Arg Gly Arg Glu Asp Ser 
225 230 235 240 

Leu Pro Lys Gly His Val Ile Tyr Glu Gly Lys Lys Gly His Val Leu 
245 250 255 

Ser Tyr Glu Gly Gly Met Ser Val Thr Gln Cys Ser Lys Glu Asp Gly 
260 265 270 

Arg Ser Ser Ser Gly Pro Pro His Glu Thr Ala Ala Pro Lys Arg Thr 
275 280 285 

Tyr Asp Met Met Glu Gly Arg Val Gly Arg Ala Ile Ser Ser Ala Ser 
290 295 300 

Ile Glu Gly Leu Met Gly Arg Ala Ile Pro Pro Glu Arg His Ser Pro 
305 310 315 320 

His His Leu Lys Glu Gln His His Ile Arg Gly Ser Ile Thr Gln Gly 
325 330 335 

Ile Pro Arg Ser Tyr Val Glu Ala Gln Glu Asp Tyr Leu Arg Arg Glu 
340 345 350 

Ala Lys Leu Leu Lys Arg Glu Gly Thr Pro Pro Pro Pro Pro Pro Ser 
355 360 365 

Arg Asp Leu Thr Glu Ala Tyr Lys Thr Gln Ala Leu Gly Pro Leu Lys 
370 375 380 

Leu Lys Pro Ala His Glu Gly Leu Val Ala Thr Val Lys Glu Ala Gly 
385 390 395 400 

Arg Ser Ile His Glu Ile Pro Arg Glu Glu Leu Arg His Thr Pro Glu 
405 410 415 

Leu Pro Leu Ala Pro Arg Pro Leu Lys Glu Gly Ser Ile Thr Gln Gly 
420 425 430 

Thr Pro Leu Lys Tyr Asp Thr Gly Ala Ser Thr Thr Gly Ser Lys Lys 
435 440 445 

His Asp Val Arg Ser Leu Ile Gly Ser Pro Gly Arg Thr Phe Pro Pro 
450 455 460 

Val His Pro Leu Asp Val Met Ala Asp Ala Arg Ala Leu Glu Arg Ala 
465 470 475 480 

Cys Tyr Glu Glu Ser Leu Lys Ser Arg Pro Gly Thr Ala Ser Ser Ser 
485 490 495 

Gly Gly Ser Ile Ala Arg Gly Ala Pro Val Ile Val Pro Glu Leu Gly 
500 505 510 

Lys Pro Arg Gln Ser Pro Leu Thr Tyr Glu Asp His Gly Ala Pro Phe 
515 520 525 

Ala Gly His Leu Pro Arg Gly Ser Pro Val Thr Met Arg Glu Pro Thr 
530 535 540 

Pro Arg Leu Gln Glu Gly Ser Leu Ser Ser Ser Lys Ala Ser Gln Asp 
545 550 555 560 

Arg Lys Leu Thr Ser Thr Pro Arg Glu Ile Ala Lys Ser Pro His Ser 
565 570 575 

Thr Val Pro Glu His His Pro His Pro Ile Ser Pro Tyr Glu His Leu 
580 585 590 

Leu Arg Gly Val Ser Gly Val Asp Leu Tyr Arg Ser His Ile Pro Leu 
595 600 605 

Ala Phe Asp Pro Thr Ser Ile Pro Arg Gly Ile Pro Leu Asp Ala Ala 
610 615 620 

Ala Ala Tyr Tyr Leu Pro Arg His Leu Ala Pro Asn Pro Thr Tyr Pro 
625 630 635 640 

His Leu Tyr Pro Pro Tyr Leu Ile Arg Gly Tyr Pro Asp Thr Ala Ala 
645 650 655 

Leu Glu Asn Arg Gln Thr Ile Ile Asn Asp Tyr Ile Thr Ser Gln Gln 
660 665 670 

Met His His Asn Thr Ala Thr Ala Met Ala Gln Arg Ala Asp Met Leu 
675 680 685 

Arg Gly Leu Ser Pro Arg Glu Ser Ser Leu Ala Leu Asn Tyr Ala Ala 
690 695 700 

Gly Pro Arg Gly Ile Ile Asp Leu Ser Gln Val Pro His Leu Pro Val 
705 710 715 720 

Leu Val Pro Pro Thr Pro Gly Thr Pro Ala Thr Ala Met Asp Arg Leu 
725 730 735 

Ala Tyr Leu Pro Thr Ala Pro Gln Pro Phe Ser Ser Arg His Ser Ser 
740 745 750 

Ser Pro Leu Ser Pro Gly Gly Pro Thr His Leu Thr Lys Pro Thr Thr 
755 760 765 

Thr Ser Ser Ser Glu Arg Glu Arg Asp Arg Asp Arg Glu Arg Asp Arg 
770 775 780 

Asp Arg Glu Arg Glu Lys Ser Ile Leu Thr Ser Thr Thr Thr Val Glu 
785 790 795 800 

His Ala Pro Ile Trp Arg Pro Gly Thr Glu Gln Ser Ser Gly Ser Ser 
805 810 815 

Gly Ser Ser Gly Gly Gly Gly Gly Ser Ser Ser Arg Pro Ala Ser His 
820 825 830 

Ser His Ala His Gln His Ser Pro Ile Ser Pro Arg Thr Gln Asp Ala 
835 840 845 

Leu Gln Gln Arg Pro Ser Val Leu His Asn Thr Gly Met Lys Gly Ile 
850 855 860 

Ile Thr Ala Val Glu Pro Ser Lys Pro Thr Val Leu Arg Ser Thr Ser 
865 870 875 880 

Thr Ser Ser Pro Val Arg Pro Ala Ala Thr Phe Pro Pro Ala Thr His 
885 890 895 

Cys Pro Leu Gly Gly Thr Leu Asp Gly Val Tyr Pro Thr Leu Met Glu 
900 905 910 

Pro Val Leu Leu Pro Lys Glu Ala Pro Arg Val Ala Arg Pro Glu Arg 
915 920 925 

Pro Arg Ala Asp Thr Gly His Ala Phe Leu Ala Lys Pro Pro Ala Arg 
930 935 940 

Ser Gly Leu Glu Pro Ala Ser Ser Pro Ser Lys Gly Ser Glu Pro Arg 
945 950 955 960 

Pro Leu Val Pro Pro Val Ser Gly His Ala Thr Ile Ala Arg Thr Pro 
965 970 975 

Ala Lys Asn Leu Ala Pro His His Ala Ser Pro Asp Pro Pro Ala Pro 
980 985 990 

Pro Ala Ser Ala Ser Asp Pro His Arg Glu Lys Thr Gln Ser Lys Pro 
995 1000 1005 

Phe Ser Ile Gln Glu Leu Glu Leu Arg Ser Leu Gly Tyr His Gly Ser 
1010 1015 1020 

Ser Tyr Ser Pro Glu Gly Val Glu Pro Val Ser Pro Val Ser Ser Pro 
1025 1030 1035 1040 

Ser Leu Thr His Asp Lys Gly Leu Pro Lys His Leu Glu Glu Leu Asp 
1045 1050 1055 

Lys Ser His Leu Glu Gly Glu Leu Arg Pro Lys Gln Pro Gly Pro Val 
1060 1065 1070 

Lys Leu Gly Gly Glu Ala Ala His Leu Pro His Leu Arg Pro Leu Pro 
1075 1080 1085 

Glu Ser Gln Pro Ser Ser Ser Pro Leu Leu Gln Thr Ala Pro Gly Val 
1090 1095 1100 

Lys Gly His Gln Arg Val Val Thr Leu Ala Gln His Ile Ser Glu Val 
1105 1110 1115 1120 

Ile Thr Gln Asp Tyr Thr Arg His His Pro Gln Gln Leu Ser Ala Pro 
1125 1130 1135 

Leu Pro Ala Pro Leu Tyr Ser Phe Pro Gly Ala Ser Cys Pro Val Leu 
1140 1145 1150 

Asp Leu Arg Arg Pro Pro Ser Asp Leu Tyr Leu Pro Pro Pro Asp His 
1155 1160 1165 

Gly Ala Pro Ala Arg Gly Ser Pro His Ser Glu Gly Gly Lys Arg Ser 
1170 1175 1180 

Pro Glu Pro Asn Lys Thr Ser Val Leu Gly Gly Gly Glu Asp Gly Ile 
1185 1190 1195 1200 

Glu Pro Val Ser Pro Pro Glu Gly Met Thr Glu Pro Gly His Ser Arg 
1205 1210 1215 

Ser Ala Val Tyr Pro Leu Leu Tyr Arg Asp Gly Glu Gln Thr Glu Pro 
1220 1225 1230 

Ser Arg Met Gly Ser Lys Ser Pro Gly Asn Thr Ser Gln Pro Pro Ala 
1235 1240 1245 

Phe Phe Ser Lys Leu Thr Glu Ser Asn Ser Ala Met Val Lys Ser Lys 
1250 1255 1260 

Lys Gln Glu Ile Asn Lys Lys Leu Asn Thr His Asn Arg Asn Glu Pro 
1265 1270 1275 1280 

Glu Tyr Asn Ile Ser Gln Pro Gly Thr Glu Ile Phe Asn Met Pro Ala 
1285 1290 1295 

Ile Thr Gly Thr Gly Leu Met Thr Tyr Arg Ser Gln Ala Val Gln Glu 
1300 1305 1310 

His Ala Ser Thr Asn Met Gly Leu Glu Ala Ile Ile Arg Lys Ala Leu 
1315 1320 1325 

Met Gly Lys Tyr Asp Gln Trp Glu Glu Ser Pro Pro Leu Ser Ala Asn 
1330 1335 1340 

Ala Phe Asn Pro Leu Asn Ala Ser Ala Ser Leu Pro Ala Ala Met Pro 
1345 1350 1355 1360 

Ile Thr Ala Ala Asp Gly Arg Ser Asp His Thr Leu Thr Ser Pro Gly 
1365 1370 1375 

Gly Gly Gly Lys Ala Lys Val Ser Gly Arg Pro Ser Ser Arg Lys Ala 
1380 1385 1390 

Lys Ser Pro Ala Pro Gly Leu Ala Ser Gly Asp Arg Pro Pro Ser Val 
1395 1400 1405 

Ser Ser Val His Ser Glu Gly Asp Cys Asn Arg Arg Thr Pro Leu Thr 
1410 1415 1420 

Asn Arg Val Trp Glu Asp Arg Pro Ser Ser Ala Gly Ser Thr Pro Phe 
1425 1430 1435 1440 

Pro Tyr Asn Pro Leu Ile Met Arg Leu Gln Ala Gly Tyr Met Ala Ser 
1445 1450 1455 

Pro Pro Pro Pro Gly Leu Pro Ala Gly Ser Gly Pro Leu Ala Gly Pro 
1460 1465 1470 

His His Ala Trp Asp Glu Glu Pro Lys Pro Leu Leu Cys Ser Gln Tyr 
1475 1480 1485 

Glu Thr Leu Ser Asp Ser Glu 
1490 1495 

<210> 2 
<211> 46 
<212> PRT 

<213> Homo sapiens 
<400> 2 

Gly Lys Tyr Asp Gln Trp Glu Glu Ser Pro Pro Leu Ser Ala Asn Ala 
1 5 10 15 

Phe Asn Pro Leu Asn Ala Ser Ala Ser Leu Pro Ala Ala Met Pro Ile 
20 25 30 

Thr Ala Ala Asp Gly Arg Ser Asp His Thr Leu Thr Ser Pro 
35 40 45 

<210> 3 

<211> 17 

<212> DNA 

<213> Homo sapiens 

<400> 3 

cggaggactg tcctccg 



<210> 4 
<211> 8561 



<212> DNA 

<213> Homo sapiens 



<220> 
<221> CDS 
<222> (2)...(7555) 

<400> 4 

c atg tcg ggc tcc aca cag ctt gtg gca cag acg tgg agg gcc act gag 
Met Ser Gly Ser Thr Gln Leu Val Ala Gln Thr Trp Arg Ala Thr Glu 
1 5 10 15 

ccc cgc tac ccg ccc cac agc ctt tcc tac cca gtg cag atc gcc cgg 
Pro Arg Tyr Pro Pro His Ser Leu Ser Tyr Pro Val Gln Ile Ala Arg 
20 25 30 

acg cac acg gac gtc ggg ctc ctg gag tac cag cac cac tcc cgc gac 
Thr His Thr Asp Val Gly Leu Leu Glu Tyr Gln His His Ser Arg Asp 
35 40 45 

tat gcc tcc cac ctg tcg ccg ggc tcc atc atc cag ccc cag cgg cgg 
Tyr Ala Ser His Leu Ser Pro Gly Ser Ile Ile Gln Pro Gln Arg Arg 
50 55 60 

agg ccc tcc ctg ctg tct gag ttc cag ccc ggg aat gaa cgg tcc cag 
Arg Pro Ser Leu Leu Ser Glu Phe Gln Pro Gly Asn Glu Arg Ser Gln 
65 70 75 80 

gag ctc cac ctg cgg cca gag tcc cac tca tac ctg ccc gag ctg ggg 
Glu Leu His Leu Arg Pro Glu Ser His Ser Tyr Leu Pro Glu Leu Gly 
85 90 95 

aag tca gag atg gag ttc att gaa agc aag cgc cct cgg cta gag ctg 
Lys Ser Glu Met Glu Phe Ile Glu Ser Lys Arg Pro Arg Leu Glu Leu 
100 105 110 

ctg cct gac ccc ctg ctg cga ccg tca ccc ctg ctg gcc acg ggc cag 
Leu Pro Asp Pro Leu Leu Arg Pro Ser Pro Leu Leu Ala Thr Gly Gln 
115 120 125 

cct gcg gga tct gaa gac ctc acc aag gac cgt agc ctg acg ggc aag 
Pro Ala Gly Ser Glu Asp Leu Thr Lys Asp Arg Ser Leu Thr Gly Lys 
130 135 140 

ctg gaa ccg gtg tct ccc ccc agc ccc ccg cac act gac cct gag ctg 
Leu Glu Pro Val Ser Pro Pro Ser Pro Pro His Thr Asp Pro Glu Leu 
145 150 155 160 

gag ctg gtg ccg cca cgg ctg tcc aag gag gag ctg atc cag aac atg 
Glu Leu Val Pro Pro Arg Leu Ser Lys Glu Glu Leu Ile Gln Asn Met 
165 170 175 

gac cgc gtg gac cga gag atc acc atg gta gag cag cag atc tct aag 
Asp Arg Val Asp Arg Glu Ile Thr Met Val Glu Gln Gln Ile Ser Lys 
180 185 190 

ctg aag aag aag cag caa cag ctg gag gag gag gct gcc aag ccg ccc 
Leu Lys Lys Lys Gln Gln Gln Leu Glu Glu Glu Ala Ala Lys Pro Pro 
195 200 205 

gag cct gag aag ccc gtg tca ccg ccg ccc atc gag tcg aag cac cgc 
Glu Pro Glu Lys Pro Val Ser Pro Pro Pro Ile Glu Ser Lys His Arg 
210 215 220 

agc ctg gtg cag atc atc tac gac gag aac cgg aag aag gct gaa gct 
Ser Leu Val Gln Ile Ile Tyr Asp Glu Asn Arg Lys Lys Ala Glu Ala 
225 230 235 240 

gca cat cgg att ctg gaa ggc ctg ggg ccc cag gtg gag ctg ccg ctg 
Ala His Arg Ile Leu Glu Gly Leu Gly Pro Gln Val Glu Leu Pro Leu 
245 250 255 

tac aac cag ccc tcc gac acc cgg cag tat cat gag aac atc aaa ata 
Tyr Asn Gln Pro Ser Asp Thr Arg Gln Tyr His Glu Asn Ile Lys Ile 
260 265 270 

aac cag gcg atg cgg aag aag cta atc ttg tac ttc aag agg agg aat 
Asn Gln Ala Met Arg Lys Lys Leu Ile Leu Tyr Phe Lys Arg Arg Asn 
275 280 285 

cac gct cgg aaa caa tgg aag cag aag ttc tgc cag cgc tat gac cag 
His Ala Arg Lys Gln Trp Lys Gln Lys Phe Cys Gln Arg Tyr Asp Gln 
290 295 300 

ctc atg gag gcc ttg gaa aaa aag gtg gag cgc atc gaa aac aac ccg 
Leu Met Glu Ala Leu Glu Lys Lys Val Glu Arg Ile Glu Asn Asn Pro 
305 310 315 320 

cgc cgg cgg gcc aag gag agc aag gtg cgc gag tac tac gaa aag cag 
Arg Arg Arg Ala Lys Glu Ser Lys Val Arg Glu Tyr Tyr Glu Lys Gln 
325 330 335 

ttc cct gag atc cgc aag cag cgc gag ctg cag gag cgc atg cag agc 
Phe Pro Glu Ile Arg Lys Gln Arg Glu Leu Gln Glu Arg Met Gln Ser 
340 345 350 

agg gtg ggc cag cgg ggc agt ggg ctg tcc atg tcg gcc gcc cgc agc 
Arg Val Gly Gln Arg Gly Ser Gly Leu Ser Met Ser Ala Ala Arg Ser 
355 360 365 

gag cac gag gtg tca gag atc atc gat ggc ctc tca gag cag gag aac 
Glu His Glu Val Ser Glu Ile Ile Asp Gly Leu Ser Glu Gln Glu Asn 
370 375 380 

ctg gag aag cag atg cgc cag ctg gcc gtg atc ccg ccc atg ctg tac 
Leu Glu Lys Gln Met Arg Gln Leu Ala Val Ile Pro Pro Met Leu Tyr 
385 390 395 400 

gac gct gac cag cag cgc atc aag ttc atc aac atg aac ggg ctt atg 
Asp Ala Asp Gln Gln Arg Ile Lys Phe Ile Asn Met Asn Gly Leu Met 
405 410 415 

gcc gac ccc atg aag gtg tac aaa gac cgc cag gtc atg aac atg tgg 
Ala Asp Pro Met Lys Val Tyr Lys Asp Arg Gln Val Met Asn Met Trp 
420 425 430 

agt gag cag gag aag gag acc ttc cgg gag aag ttc atg cag cat ccc 
Ser Glu Gln Glu Lys Glu Thr Phe Arg Glu Lys Phe Met Gln His Pro 
435 440 445 

aag aac ttt ggc ctg atc gca tca ttc ctg gag agg aag aca gtg gct 
Lys Asn Phe Gly Leu Ile Ala Ser Phe Leu Glu Arg Lys Thr Val Ala 
450 455 460 

gag tgc gtc ctc tat tac tac ctg act aag aag aat gag aac tat aag 
Glu Cys Val Leu Tyr Tyr Tyr Leu Thr Lys Lys Asn Glu Asn Tyr Lys 
465 470 475 480 

agc ctg gtg aga cgg agc tat cgg cgc cgc ggc aag agc cag cag caa 
Ser Leu Val Arg Arg Ser Tyr Arg Arg Arg Gly Lys Ser Gln Gln Gln 
485 490 495 

caa cag cag cag cag cag cag cag cag cag cag cag cag cag ccc atg 
Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Pro Met 
500 505 510 

ccc cgc agc agc cag gag gag aaa gat gag aag gag aag gaa aag gag 
Pro Arg Ser Ser Gln Glu Glu Lys Asp Glu Lys Glu Lys Glu Lys Glu 
515 520 525 

gcg gag aag gag gag gag aag ccg gag gtg gag aac gac aag gaa gac 
Ala Glu Lys Glu Glu Glu Lys Pro Glu Val Glu Asn Asp Lys Glu Asp 
530 535 540 

ctc ctc aag gag aag aca gac gac acc tca ggg gag gac aac gac gag 
Leu Leu Lys Glu Lys Thr Asp Asp Thr Ser Gly Glu Asp Asn Asp Glu 
545 550 555 560 

aag gag gct gtg gcc tcc aaa ggc cgc aaa act gcc aac agc cag gga 
Lys Glu Ala Val Ala Ser Lys Gly Arg Lys Thr Ala Asn Ser Gln Gly 
565 570 575 

aga cgc aaa ggc cgc atc acc cgc tca atg gct aat gag gcc aac agc 
Arg Arg Lys Gly Arg Ile Thr Arg Ser Met Ala Asn Glu Ala Asn Ser 
580 585 590 

gag gag gcc atc acc ccc cag cag agc gcc gag ctg gcc tcc atg gag 
Glu Glu Ala Ile Thr Pro Gln Gln Ser Ala Glu Leu Ala Ser Met Glu 
595 600 605 

ctg aat gag agt tct cgc tgg aca gaa gaa gaa atg gaa aca gcc aag 
Leu Asn Glu Ser Ser Arg Trp Thr Glu Glu Glu Met Glu Thr Ala Lys 
610 615 620 

aaa ggt ctc ctg gaa cac ggc cgc aac tgg tcg gcc atc gcc cgg atg 
Lys Gly Leu Leu Glu His Gly Arg Asn Trp Ser Ala Ile Ala Arg Met 
625 630 635 640 

gtg ggc tcc aag act gtg tcg cag tgt aag aac ttc tac ttc aac tac 
Val Gly Ser Lys Thr Val Ser Gln Cys Lys Asn Phe Tyr Phe Asn Tyr 
645 650 655 

aag aag agg cag aac ctc gat gag atc ttg cag cag cac aag ctg aag 
Lys Lys Arg Gln Asn Leu Asp Glu Ile Leu Gln Gln His Lys Leu Lys 
660 665 670 

atg gag aag gag agg aac gcg cgg agg aag aag aag aaa gcg ccg gcg 
Met Glu Lys Glu Arg Asn Ala Arg Arg Lys Lys Lys Lys Ala Pro Ala 
675 680 685 

gcg gcc agc gag gag gct gca ttc ccg ccc gtg gtg gag gat gag gag 
Ala Ala Ser Glu Glu Ala Ala Phe Pro Pro Val Val Glu Asp Glu Glu 
690 695 700 

atg gag gcg tcg ggc gtg agc gga aat gag gag gag atg gtg gag gag 
Met Glu Ala Ser Gly Val Ser Gly Asn Glu Glu Glu Met Val Glu Glu 
705 710 715 720 

gct gaa gcc tta cat gcc tct ggg aat gag gtg ccc aga ggg gaa tgc 
Ala Glu Ala Leu His Ala Ser Gly Asn Glu Val Pro Arg Gly Glu Cys 
725 730 735 

agt ggc cca gcc act gtc aac aac agc tca gac acc gag agc atc ccc 
Ser Gly Pro Ala Thr Val Asn Asn Ser Ser Asp Thr Glu Ser Ile Pro 
740 745 750 

tct cct cac act gag gcc gcc aag gac aca ggg cag aat ggg ccc aag 
Ser Pro His Thr Glu Ala Ala Lys Asp Thr Gly Gln Asn Gly Pro Lys 
755 760 765 

ccc cca gcc acc ctg ggc gcc gac ggg cca ccc cca ggc cca ccc acc 
Pro Pro Ala Thr Leu Gly Ala Asp Gly Pro Pro Pro Gly Pro Pro Thr 
770 775 780 

cca cca cgg agg aca tcc cgg gcc ccc att gag ccc acc ccg gcc tct 
Pro Pro Arg Arg Thr Ser Arg Ala Pro Ile Glu Pro Thr Pro Ala Ser 
785 790 795 800 

gaa gcc acc gga gcc cct acg ccc cca cca gca ccc cca tcg ccc tct 
Glu Ala Thr Gly Ala Pro Thr Pro Pro Pro Ala Pro Pro Ser Pro Ser 
805 810 815 

gca cct cct cct gtg gtc ccc aag gag gag aag gag gag gag acc gca 
Ala Pro Pro Pro Val Val Pro Lys Glu Glu Lys Glu Glu Glu Thr Ala 
820 825 830 

gca gcg ccc cca gtg gag gag ggg gag gag cag aag ccc ccc gcg gct 
Ala Ala Pro Pro Val Glu Glu Gly Glu Glu Gln Lys Pro Pro Ala Ala 
835 840 845 

gag gag ctg gca gtg gac aca ggg aag gcc gag gag ccc gtc aag agc 
Glu Glu Leu Ala Val Asp Thr Gly Lys Ala Glu Glu Pro Val Lys Ser 
850 855 860 

gag tgc acg gag gaa gcc gag gag ggg ccg gcc aag ggc aag gac gcg 
Glu Cys Thr Glu Glu Ala Glu Glu Gly Pro Ala Lys Gly Lys Asp Ala 
865 870 875 880 

gag gcc gct gag gcc acg gcc gag ggg gcg ctc aag gca gag aag aag 
Glu Ala Ala Glu Ala Thr Ala Glu Gly Ala Leu Lys Ala Glu Lys Lys 
885 890 895 

gag ggc ggg age ggc agg gcc acc act gcc aag age teg ggc gcc ccc 
Glu Gly Gly Ser Gly Arg Ala Thr Thr Ala Lys Ser Ser Gly Ala Pro 
900 905 910 

cag gac age gac tec agt get acc tgc agt gea gac gag gtg gat gag 
Gin Asp Ser Asp Ser Ser Ala Thr Cys Ser Ala Asp Glu Val Asp Glu 
915 920 925 

gcc gag ggc ggc gac aag aac cgg ctg ctg tcc cca agg ccc agc ctc 
Ala Glu Gly Gly Asp Lys Asn Arg Leu Leu Ser Pro Arg Pro Ser Leu 
930 935 940 

ctc acc ccg act ggc gac ccc cgg gcc aat gcc tca ccc cag aag cca 
Leu Thr Pro Thr Gly Asp Pro Arg Ala Asn Ala Ser Pro Gln Lys Pro 
945 950 955 960 

ctg gac ctg aag cag ctg aag cag cga gcg gct gcc atc ccc ccc atc 
Leu Asp Leu Lys Gln Leu Lys Gln Arg Ala Ala Ala Ile Pro Pro Ile 
965 970 975 

cag gtc acc aaa gtc cat gag ccc ccc cgg gag gac gca gct ccc acc 
Gln Val Thr Lys Val His Glu Pro Pro Arg Glu Asp Ala Ala Pro Thr 
980 985 990 

aag cca gct ccc cca gcc cca ccg cca ccg caa aac ctg cag ccg gag 
Lys Pro Ala Pro Pro Ala Pro Pro Pro Pro Gln Asn Leu Gln Pro Glu 
995 1000 1005 

agc gac gcc cct cag cag cct ggc agc agc ccc cgg ggc aag agc agg 
Ser Asp Ala Pro Gln Gln Pro Gly Ser Ser Pro Arg Gly Lys Ser Arg 
1010 1015 1020 

agc ccg gca ccc ccc gcc gac aag gag gcc ttc gca gcc gag gcc cag 
Ser Pro Ala Pro Pro Ala Asp Lys Glu Ala Phe Ala Ala Glu Ala Gln 
1025 1030 1035 1040 

aag ctg cct ggg gac ccc cct tgc tgg act tcc ggc ctg ccc ttc ccc 
Lys Leu Pro Gly Asp Pro Pro Cys Trp Thr Ser Gly Leu Pro Phe Pro 
1045 1050 1055 

gtg ccc ccc cgt gag gtg atc aag gcc tcc ccg cat gcc ccg gac ccc 
Val Pro Pro Arg Glu Val Ile Lys Ala Ser Pro His Ala Pro Asp Pro 
1060 1065 1070 

tca gcc ttc tcc tac gct cca cct ggt cac cca ctg ccc ctg ggc ctc 
Ser Ala Phe Ser Tyr Ala Pro Pro Gly His Pro Leu Pro Leu Gly Leu 
1075 1080 1085 

cat gac act gcc cgg ccc gtc ctg ccg cgc cca ccc acc atc tcc aac 
His Asp Thr Ala Arg Pro Val Leu Pro Arg Pro Pro Thr Ile Ser Asn 
1090 1095 1100 

ccg cct ccc ctc atc tcc tct gcc aag cac ccc agc gtc ctc gag agg 
Pro Pro Pro Leu Ile Ser Ser Ala Lys His Pro Ser Val Leu Glu Arg 
1105 1110 1115 1120 

caa ata ggt gcc atc tcc caa gga atg tcg gtc cag ctc cac gtc ccg 
Gln Ile Gly Ala Ile Ser Gln Gly Met Ser Val Gln Leu His Val Pro 
1125 1130 1135 

tac tca gag cat gcc aag gcc ccg gtg ggc cct gtc acc atg ggg ctg 
Tyr Ser Glu His Ala Lys Ala Pro Val Gly Pro Val Thr Met Gly Leu 
1140 1145 1150 

ccc ctg ccc atg gac ccc aaa aag ctg gca ccc ttc agc gga gtg aag 
Pro Leu Pro Met Asp Pro Lys Lys Leu Ala Pro Phe Ser Gly Val Lys 
1155 1160 1165 

cag gag cag ctg tcc cca cgg ggc cag gct ggg cca ccg gag agc ctg 
Gln Glu Gln Leu Ser Pro Arg Gly Gln Ala Gly Pro Pro Glu Ser Leu 
1170 1175 1180 

ggg gtg ccc aca gcc cag gag gcg tcc gtg ctg aga ggg aca gct ctg 
Gly Val Pro Thr Ala Gln Glu Ala Ser Val Leu Arg Gly Thr Ala Leu 
1185 1190 1195 1200 

ggc tca gtt ccg ggc gga agc atc acc aaa ggc att ccc agc aca cgg 
Gly Ser Val Pro Gly Gly Ser Ile Thr Lys Gly Ile Pro Ser Thr Arg 
1205 1210 1215 

gtg ccc tcg gac agc gcc atc aca tac cgc ggc tcc atc acc cac ggc 
Val Pro Ser Asp Ser Ala Ile Thr Tyr Arg Gly Ser Ile Thr His Gly 
1220 1225 1230 

acg cca gct gac gtc ctg tac aag ggc acc atc acc agg atc atc ggc 
Thr Pro Ala Asp Val Leu Tyr Lys Gly Thr Ile Thr Arg Ile Ile Gly 
1235 1240 1245 

gag gac agc ccg agt cgc ttg gac cgc ggc cgg gag gac agc ctg ccc 
Glu Asp Ser Pro Ser Arg Leu Asp Arg Gly Arg Glu Asp Ser Leu Pro 
1250 1255 1260 

aag ggc cac gtc atc tac gaa ggc aag aag ggc cac gtc ttg tcc tat 
Lys Gly His Val Ile Tyr Glu Gly Lys Lys Gly His Val Leu Ser Tyr 
1265 1270 1275 1280 

gag ggt ggc atg tct gtg acc cag tgc tcc aag gag gac ggc aga agc 
Glu Gly Gly Met Ser Val Thr Gln Cys Ser Lys Glu Asp Gly Arg Ser 
1285 1290 1295 

agc tca gga ccc ccc cat gag acg gcc gcc ccc aag cgc acc tat gac 
Ser Ser Gly Pro Pro His Glu Thr Ala Ala Pro Lys Arg Thr Tyr Asp 
1300 1305 1310 

atg atg gag ggc cgc gtg ggc aga gcc atc tcc tca gcc agc atc gaa 
Met Met Glu Gly Arg Val Gly Arg Ala Ile Ser Ser Ala Ser Ile Glu 
1315 1320 1325 

ggt ctc atg ggc cgt gcc atc ccg ccg gag cga cac agc ccc cac cac 
Gly Leu Met Gly Arg Ala Ile Pro Pro Glu Arg His Ser Pro His His 
1330 1335 1340 

ctc aaa gag cag cac cac atc cgc ggg tcc atc aca caa ggg atc cct 
Leu Lys Glu Gln His His Ile Arg Gly Ser Ile Thr Gln Gly Ile Pro 
1345 1350 1355 1360 

cgg tcc tac gtg gag gca cag gag gac tac ctg cgt cgg gag gcc aag 
Arg Ser Tyr Val Glu Ala Gln Glu Asp Tyr Leu Arg Arg Glu Ala Lys 
1365 1370 1375 

ctc cta aag cgg gag ggc acg cct ccg ccc cca ccg ccc tca cgg gac 
Leu Leu Lys Arg Glu Gly Thr Pro Pro Pro Pro Pro Pro Ser Arg Asp 
1380 1385 1390 

ctg acc gag gcc tac aag acg cag gcc ctg ggc ccc ctg aag ctg aag 
Leu Thr Glu Ala Tyr Lys Thr Gln Ala Leu Gly Pro Leu Lys Leu Lys 
1395 1400 1405 

ccg gcc cat gag ggc ctg gtg gcc acg gtg aag gag gcg ggc cgc tcc 
Pro Ala His Glu Gly Leu Val Ala Thr Val Lys Glu Ala Gly Arg Ser 
1410 1415 1420 

atc cat gag atc ccg cgc gag gag ctg cgg cac acg ccc gag ctg ccc 
Ile His Glu Ile Pro Arg Glu Glu Leu Arg His Thr Pro Glu Leu Pro 
1425 1430 1435 1440 

ctg gcc ccg cgg ccg ctc aag gag ggc tcc atc acg cag ggc acc ccg 
Leu Ala Pro Arg Pro Leu Lys Glu Gly Ser Ile Thr Gln Gly Thr Pro 
1445 1450 1455 

ctc aag tac gac acc ggc gcg tcc acc act ggc tcc aaa aag cac gac 
Leu Lys Tyr Asp Thr Gly Ala Ser Thr Thr Gly Ser Lys Lys His Asp 
1460 1465 1470 

gta cgc tcc ctc atc ggc agc ccc ggc cgg acg ttc cca ccc gtg cac 
Val Arg Ser Leu Ile Gly Ser Pro Gly Arg Thr Phe Pro Pro Val His 
1475 1480 1485 

ccg ctg gat gtg atg gcc gac gcc cgg gca ctg gaa cgt gcc tgc tac 
Pro Leu Asp Val Met Ala Asp Ala Arg Ala Leu Glu Arg Ala Cys Tyr 
1490 1495 1500 

gag gag agc ctg aag agc cgg cca ggg acc gcc agc agc tcg ggg ggc 
Glu Glu Ser Leu Lys Ser Arg Pro Gly Thr Ala Ser Ser Ser Gly Gly 
1505 1510 1515 1520 

tcc att gcg cgc ggc gcc ccg gtc att gtg cct gag ctg ggt aag ccg 
Ser Ile Ala Arg Gly Ala Pro Val Ile Val Pro Glu Leu Gly Lys Pro 
1525 1530 1535 

cgg cag agc ccc ctg acc tat gag gac cac ggg gca ccc ttt gcc ggc 
Arg Gln Ser Pro Leu Thr Tyr Glu Asp His Gly Ala Pro Phe Ala Gly 
1540 1545 1550 

cac ctc cca cga ggt tcg ccc gtg acc atg cgg gag ccc acg ccg cgc 
His Leu Pro Arg Gly Ser Pro Val Thr Met Arg Glu Pro Thr Pro Arg 
1555 1560 1565 

ctg cag gag ggc agc ctt tcg tcc agc aag gca tcc cag gac cga aag 
Leu Gln Glu Gly Ser Leu Ser Ser Ser Lys Ala Ser Gln Asp Arg Lys 
1570 1575 1580 

ctg acg tcg acg cct cgt gag atc gcc aag tcc ccg cac agc acc gtg 
Leu Thr Ser Thr Pro Arg Glu Ile Ala Lys Ser Pro His Ser Thr Val 
1585 1590 1595 1600 

ccc gag cac cac cca cac ccc atc tcg ccc tat gag cac ctg ctt cgg 
Pro Glu His His Pro His Pro Ile Ser Pro Tyr Glu His Leu Leu Arg 
1605 1610 1615 

ggc gtg agt ggc gtg gac ctg tat cgc agc cac atc ccc ctg gcc ttc 
Gly Val Ser Gly Val Asp Leu Tyr Arg Ser His Ile Pro Leu Ala Phe 
1620 1625 1630 

gac ccc acc tcc ata ccc cgc ggc atc cct ctg gac gca gcc gct gcc 
Asp Pro Thr Ser Ile Pro Arg Gly Ile Pro Leu Asp Ala Ala Ala Ala 
1635 1640 1645 

tac tac ctg ccc cga cac ctg gcc ccc aac ccc acc tac ccg cac ctg 
Tyr Tyr Leu Pro Arg His Leu Ala Pro Asn Pro Thr Tyr Pro His Leu 
1650 1655 1660 

tac cca ccc tac ctc atc cgc ggc tac ccc gac acg gcg gcg ctg gag 
Tyr Pro Pro Tyr Leu Ile Arg Gly Tyr Pro Asp Thr Ala Ala Leu Glu 
1665 1670 1675 1680 

aac cgg cag acc atc atc aat gac tac atc acc tcg cag cag atg cac 
Asn Arg Gln Thr Ile Ile Asn Asp Tyr Ile Thr Ser Gln Gln Met His 
1685 1690 1695 

cac aac acg gcc acc gcc atg gcc cag cga gct gat atg ctg agg ggc 
His Asn Thr Ala Thr Ala Met Ala Gln Arg Ala Asp Met Leu Arg Gly 
1700 1705 1710 

ctc tcg ccc cgc gag tcc tcg ctg gca ctc aac tac gct gcg ggt ccc 
Leu Ser Pro Arg Glu Ser Ser Leu Ala Leu Asn Tyr Ala Ala Gly Pro 
1715 1720 1725 

cga ggc atc atc gac ctg tcc caa gtg cca cac ctg cct gtg ctc gtg 
Arg Gly Ile Ile Asp Leu Ser Gln Val Pro His Leu Pro Val Leu Val 
1730 1735 1740 

ccc ccg aca cca ggc acc cca gcc acc gcc atg gac cgc ctt gcc tac 
Pro Pro Thr Pro Gly Thr Pro Ala Thr Ala Met Asp Arg Leu Ala Tyr 
1745 1750 1755 1760 

ctc ccc acc gcg ccc cag ccc ttc agc agc cgc cac agc agc tcc cca 
Leu Pro Thr Ala Pro Gln Pro Phe Ser Ser Arg His Ser Ser Ser Pro 
1765 1770 1775 

ctc tcc cca gga ggt cca aca cac ttg aca aaa cca acc acc acg tcc 
Leu Ser Pro Gly Gly Pro Thr His Leu Thr Lys Pro Thr Thr Thr Ser 
1780 1785 1790 

tcg tcc gag cgg gag cga gac cgg gat cga gag cgg gac cgg gat cgg 
Ser Ser Glu Arg Glu Arg Asp Arg Asp Arg Glu Arg Asp Arg Asp Arg 
1795 1800 1805 

gag cgg gaa aag tcc atc ctc acg tcc acc acg acg gtg gag cac gca 
Glu Arg Glu Lys Ser Ile Leu Thr Ser Thr Thr Thr Val Glu His Ala 
1810 1815 1820 

ccc atc tgg aga cct ggt aca gag cag agc agc ggc agc agc ggc agc 
Pro Ile Trp Arg Pro Gly Thr Glu Gln Ser Ser Gly Ser Ser Gly Ser 
1825 1830 1835 1840 

agc ggc ggg ggt ggg ggc agc agc agc cgc ccc gcc tcc cac tcc cat 
Ser Gly Gly Gly Gly Gly Ser Ser Ser Arg Pro Ala Ser His Ser His 
1845 1850 1855 

gcc cac cag cac tcg ccc atc tcc cct cgg acc cag gat gcc ctc cag 
Ala His Gln His Ser Pro Ile Ser Pro Arg Thr Gln Asp Ala Leu Gln 
1860 1865 1870 

cag aga ccc agt gtg ctt cac aac aca ggc atg aag ggt atc atc acc 
Gln Arg Pro Ser Val Leu His Asn Thr Gly Met Lys Gly Ile Ile Thr 
1875 1880 1885 

gct gtg gag ccc agc aag ccc acg gtc ctg agg tcc acc tcc acc tcc 
Ala Val Glu Pro Ser Lys Pro Thr Val Leu Arg Ser Thr Ser Thr Ser 
1890 1895 1900 

tca ccc gtt cgc cca gct gcc aca ttc cca cct gcc acc cac tgc cca 
Ser Pro Val Arg Pro Ala Ala Thr Phe Pro Pro Ala Thr His Cys Pro 
1905 1910 1915 1920 

ctg ggc ggc acc ctc gat ggg gtc tac cct acc ctc atg gag ccc gtc 
Leu Gly Gly Thr Leu Asp Gly Val Tyr Pro Thr Leu Met Glu Pro Val 
1925 1930 1935 

ttg ctg ccc aag gag gcc ccc cgg gtc gcc cgg cca gag cgg ccc cga 
Leu Leu Pro Lys Glu Ala Pro Arg Val Ala Arg Pro Glu Arg Pro Arg 
1940 1945 1950 

gca gac acc ggc cat gcc ttc ctc gcc aag ccc cca gcc cgc tcc ggg 
Ala Asp Thr Gly His Ala Phe Leu Ala Lys Pro Pro Ala Arg Ser Gly 
1955 1960 1965 

ctg gag ccc gcc tcc tcc ccc agc aag ggc tcg gag ccc cgg ccc cta 
Leu Glu Pro Ala Ser Ser Pro Ser Lys Gly Ser Glu Pro Arg Pro Leu 
1970 1975 1980 

gtg cct cct gtc tct ggc cac gcc acc atc gcc cgc acc cct gcg aag 
Val Pro Pro Val Ser Gly His Ala Thr Ile Ala Arg Thr Pro Ala Lys 
1985 1990 1995 2000 

aac ctc gca cct cac cac gcc agc ccg gac ccg ccg gcg cca cct gcc 
Asn Leu Ala Pro His His Ala Ser Pro Asp Pro Pro Ala Pro Pro Ala 
2005 2010 2015 

tcg gcc tcg gac ccg cac cgg gaa aag act caa agt aaa ccc ttt tcc 
Ser Ala Ser Asp Pro His Arg Glu Lys Thr Gln Ser Lys Pro Phe Ser 
2020 2025 2030 

atc cag gaa ctg gaa ctc cgt tct ctg ggt tac cac ggc agc agc tac 
Ile Gln Glu Leu Glu Leu Arg Ser Leu Gly Tyr His Gly Ser Ser Tyr 
2035 2040 2045 

agc ccc gaa ggg gtg gag ccc gtc agc cct gtg agc tca ccc agt ctg 
Ser Pro Glu Gly Val Glu Pro Val Ser Pro Val Ser Ser Pro Ser Leu 
2050 2055 2060 

acc cac gac aag ggg ctc ccc aag cac ctg gaa gag ctc gac aag agc 
Thr His Asp Lys Gly Leu Pro Lys His Leu Glu Glu Leu Asp Lys Ser 
2065 2070 2075 2080 

cac ctg gag ggg gag ctg cgg ccc aag cag cca ggc ccc gtg aag ctt 
His Leu Glu Gly Glu Leu Arg Pro Lys Gln Pro Gly Pro Val Lys Leu 
2085 2090 2095 

ggc ggg gag gcc gcc cac ctc cca cac ctg cgg ccg ctg cct gag agc 
Gly Gly Glu Ala Ala His Leu Pro His Leu Arg Pro Leu Pro Glu Ser 
2100 2105 2110 

cag ccc tcg tcc agc ccg ctg ctc cag acc gcc cca ggg gtc aaa ggt 
Gln Pro Ser Ser Ser Pro Leu Leu Gln Thr Ala Pro Gly Val Lys Gly 
2115 2120 2125 

cac cag cgg gtg gtc acc ctg gcc cag cac atc agt gag gtc atc aca 
His Gln Arg Val Val Thr Leu Ala Gln His Ile Ser Glu Val Ile Thr 
2130 2135 2140 

cag gac tac acc cgg cac cac cca cag cag ctc agc gca ccc ctg ccc 
Gln Asp Tyr Thr Arg His His Pro Gln Gln Leu Ser Ala Pro Leu Pro 
2145 2150 2155 2160 

gcc ccc ctc tac tcc ttc cct ggg gcc agc tgc ccc gtc ctg gac ctc 
Ala Pro Leu Tyr Ser Phe Pro Gly Ala Ser Cys Pro Val Leu Asp Leu 
2165 2170 2175 

cgc cgc cca ccc agt gac ctc tac ctc ccg ccc ccg gac cat ggt gcc 
Arg Arg Pro Pro Ser Asp Leu Tyr Leu Pro Pro Pro Asp His Gly Ala 
2180 2185 2190 

ccg gcc cgt ggc tcc ccc cac agc gaa ggg ggc aag agg tct cca gag 
Pro Ala Arg Gly Ser Pro His Ser Glu Gly Gly Lys Arg Ser Pro Glu 
2195 2200 2205 

cca aac aag acg tcg gtc ttg ggt ggt ggt gag gac ggt att gaa cct 
Pro Asn Lys Thr Ser Val Leu Gly Gly Gly Glu Asp Gly Ile Glu Pro 
2210 2215 2220 

gtg tcc cca ccg gag ggc atg acg gag cca ggg cac tcc cgg agt gct 
Val Ser Pro Pro Glu Gly Met Thr Glu Pro Gly His Ser Arg Ser Ala 
2225 2230 2235 2240 

gtg tac ccg ctg ctg tac cgg gat ggg gaa cag acg gag ccc agc agg 
Val Tyr Pro Leu Leu Tyr Arg Asp Gly Glu Gln Thr Glu Pro Ser Arg 
2245 2250 2255 

atg ggc tcc aag tct cca ggc aac acc agc cag ccg cca gcc ttc ttc 
Met Gly Ser Lys Ser Pro Gly Asn Thr Ser Gln Pro Pro Ala Phe Phe 
2260 2265 2270 

agc aag ctg acc gag agc aac tcc gcc atg gtc aag tcc aag aag caa 
Ser Lys Leu Thr Glu Ser Asn Ser Ala Met Val Lys Ser Lys Lys Gln 
2275 2280 2285 

gag atc aac aag aag ctg aac acc cac aac cgg aat gag cct gaa tac 
Glu Ile Asn Lys Lys Leu Asn Thr His Asn Arg Asn Glu Pro Glu Tyr 
2290 2295 2300 

aat atc agc cag cct ggg acg gag atc ttc aat atg ccc gcc atc acc 
Asn Ile Ser Gln Pro Gly Thr Glu Ile Phe Asn Met Pro Ala Ile Thr 
2305 2310 2315 2320 

gga aca ggc ctt atg acc tat aga agc cag gcg gtg cag gaa cat gcc 
Gly Thr Gly Leu Met Thr Tyr Arg Ser Gln Ala Val Gln Glu His Ala 
2325 2330 2335 

agc acc aac atg ggg ctg gag gcc ata att aga aag gca ctc atg ggt 
Ser Thr Asn Met Gly Leu Glu Ala Ile Ile Arg Lys Ala Leu Met Gly 
2340 2345 2350 

aaa tat gac cag tgg gaa gag tcc ccg ccg ctc agc gcc aat gct ttt 
Lys Tyr Asp Gln Trp Glu Glu Ser Pro Pro Leu Ser Ala Asn Ala Phe 
2355 2360 2365 

aac cct ctg aat gcc agt gcc agc ctg ccc gct gct atg ccc ata acc 7153 
Asn Pro Leu Asn Ala Ser Ala Ser Leu Pro Ala Ala Met Pro Ile Thr 
2370 2375 2380 

gct gct gac gga cgg agt gac cac aca ctc acc tcg cca ggt ggc ggc 7201 
Ala Ala Asp Gly Arg Ser Asp His Thr Leu Thr Ser Pro Gly Gly Gly 
2385 2390 2395 2400 

ggg aag gcc aag gtc tct ggc aga ccc agc agc cga aaa gcc aag tcc 7249 
Gly Lys Ala Lys Val Ser Gly Arg Pro Ser Ser Arg Lys Ala Lys Ser 
2405 2410 2415 

ccg gcc ccg ggc ctg gca tct ggg gac cgg cca ccc tct gtc tcc tca 7297 
Pro Ala Pro Gly Leu Ala Ser Gly Asp Arg Pro Pro Ser Val Ser Ser 
2420 2425 2430 

gtg cac tcg gag gga gac tgc aac cgc cgg acg ccg ctc acc aac cgc 7345 
Val His Ser Glu Gly Asp Cys Asn Arg Arg Thr Pro Leu Thr Asn Arg 
2435 2440 2445 

gtg tgg gag gac agg ccc tcg tcc gca ggt tcc acg cca ttc ccc tac 7393 
Val Trp Glu Asp Arg Pro Ser Ser Ala Gly Ser Thr Pro Phe Pro Tyr 
2450 2455 2460 

aac ccc ctg atc atg cgg ctg cag gcg ggt gtc atg gct tcc cca ccc 7441 
Asn Pro Leu Ile Met Arg Leu Gln Ala Gly Val Met Ala Ser Pro Pro 
2465 2470 2475 2480 

cca ccg ggc ctc ccc gcg ggc agc ggg ccc ctc gct ggc ccc cac cac 7489 
Pro Pro Gly Leu Pro Ala Gly Ser Gly Pro Leu Ala Gly Pro His His 
2485 2490 2495 

gcc tgg gac gag gag ccc aag cca ctg ctc tgc tcg cag tac gag aca 7537 
Ala Trp Asp Glu Glu Pro Lys Pro Leu Leu Cys Ser Gln Tyr Glu Thr 
2500 2505 2510 

ctc tcc gac agc gag tga ctcagaacag ggcggggggg ggcgggcggt 7585 
Leu Ser Asp Ser Glu * 
2515 

gtcaggtccc agcgagccac aggaacggcc ctgcaggagc ggggcggctg ccgactcccc 7645 

caaccaagga aggagcccct gagtccgcct gcgcctccat ccatctgtcc gtccagagcc 7705 

ggcatccttg cctgtctaaa gccttaacta agactcccgc cccgggctgg ccctgtgcag 7765 

accttactca ggggatgttt acctggtgct cgggaaggga ggggaagggg ccggggaggg 7825 

ggcacggcag gcgtgtggca gccacacaca ggcggccagg gcggccaggg acccaaagca 7885 

ggatgaccac gcacctccac gccactgcct cccccgaatg catttggaac caaagtctaa 7945 

actgagctcg cagcccccgc gccctccctc cgcctcccat cccgcttagc gctctggaca 8005 

gatggacgca ggccctgtcc agcccccagt gcgctcgttc cggtccccac agactgcccc 8065 

agccaacgag attgctggaa accaagtcag gccaggtggg cggacaaaag ggccaggtgc 8125 

ggcctggggg gaacggatgc tccgaggact ggactgtttt tttcacacat cgttgccgca 8185 

gcggtgggaa ggaaaggcag atgtaaatga tgtgttggtt tacagggtat atttttgata 8245 

ccttcaatga attaattcag atgttttacg caaggaagga cttacccagt attactgctg 8305 

ctgtgctttt gatctctgct taccgttcaa gaggcgtgtg caggccgaca gtcggtgacc 8365 

ccatcactcg caggaccaag ggggcgggga ctgctcgtca cgccccgctg tgtcctccct 8425 

ccctcccttc cttgggcaga atgaattcga tgcgtattct gtggccgcca tttgcgcagg 8485 

gtggtggtat tctgtcattt acacacgtcg ttctaattaa aaagcgaatt atactccaaa 8545 

aaaaaaaaaa aaaaaa 8561 



<210> 5 
<211> 2517 
<212> PRT 
<213> Homo sapiens 

<400> 5 

Met Ser Gly Ser Thr Gln Leu Val Ala Gln Thr Trp Arg Ala Thr Glu 
1 5 10 15 

Pro Arg Tyr Pro Pro His Ser Leu Ser Tyr Pro Val Gln Ile Ala Arg 
20 25 30 

Thr His Thr Asp Val Gly Leu Leu Glu Tyr Gln His His Ser Arg Asp 
35 40 45 

Tyr Ala Ser His Leu Ser Pro Gly Ser Ile Ile Gln Pro Gln Arg Arg 
50 55 60 

Arg Pro Ser Leu Leu Ser Glu Phe Gln Pro Gly Asn Glu Arg Ser Gln 
65 70 75 80 

Glu Leu His Leu Arg Pro Glu Ser His Ser Tyr Leu Pro Glu Leu Gly 
85 90 95 

Lys Ser Glu Met Glu Phe Ile Glu Ser Lys Arg Pro Arg Leu Glu Leu 
100 105 110 

Leu Pro Asp Pro Leu Leu Arg Pro Ser Pro Leu Leu Ala Thr Gly Gln 
115 120 125 

Pro Ala Gly Ser Glu Asp Leu Thr Lys Asp Arg Ser Leu Thr Gly Lys 
130 135 140 

Leu Glu Pro Val Ser Pro Pro Ser Pro Pro His Thr Asp Pro Glu Leu 
145 150 155 160 

Glu Leu Val Pro Pro Arg Leu Ser Lys Glu Glu Leu Ile Gln Asn Met 
165 170 175 

Asp Arg Val Asp Arg Glu Ile Thr Met Val Glu Gln Gln Ile Ser Lys 
180 185 190 

Leu Lys Lys Lys Gln Gln Gln Leu Glu Glu Glu Ala Ala Lys Pro Pro 
195 200 205 

Glu Pro Glu Lys Pro Val Ser Pro Pro Pro Ile Glu Ser Lys His Arg 
210 215 220 

Ser Leu Val Gln Ile Ile Tyr Asp Glu Asn Arg Lys Lys Ala Glu Ala 
225 230 235 240 

Ala His Arg Ile Leu Glu Gly Leu Gly Pro Gln Val Glu Leu Pro Leu 
245 250 255 

Tyr Asn Gln Pro Ser Asp Thr Arg Gln Tyr His Glu Asn Ile Lys Ile 
260 265 270 

Asn Gln Ala Met Arg Lys Lys Leu Ile Leu Tyr Phe Lys Arg Arg Asn 
275 280 285 

His Ala Arg Lys Gln Trp Lys Gln Lys Phe Cys Gln Arg Tyr Asp Gln 
290 295 300 

Leu Met Glu Ala Leu Glu Lys Lys Val Glu Arg Ile Glu Asn Asn Pro 
305 310 315 320 

Arg Arg Arg Ala Lys Glu Ser Lys Val Arg Glu Tyr Tyr Glu Lys Gln 
325 330 335 

Phe Pro Glu Ile Arg Lys Gln Arg Glu Leu Gln Glu Arg Met Gln Ser 
340 345 350 

Arg Val Gly Gln Arg Gly Ser Gly Leu Ser Met Ser Ala Ala Arg Ser 
355 360 365 

Glu His Glu Val Ser Glu Ile Ile Asp Gly Leu Ser Glu Gln Glu Asn 
370 375 380 

Leu Glu Lys Gln Met Arg Gln Leu Ala Val Ile Pro Pro Met Leu Tyr 
385 390 395 400 

Asp Ala Asp Gln Gln Arg Ile Lys Phe Ile Asn Met Asn Gly Leu Met 
405 410 415 

Ala Asp Pro Met Lys Val Tyr Lys Asp Arg Gln Val Met Asn Met Trp 
420 425 430 

Ser Glu Gln Glu Lys Glu Thr Phe Arg Glu Lys Phe Met Gln His Pro 
435 440 445 

Lys Asn Phe Gly Leu Ile Ala Ser Phe Leu Glu Arg Lys Thr Val Ala 
450 455 460 

Glu Cys Val Leu Tyr Tyr Tyr Leu Thr Lys Lys Asn Glu Asn Tyr Lys 
465 470 475 480 

Ser Leu Val Arg Arg Ser Tyr Arg Arg Arg Gly Lys Ser Gln Gln Gln 
485 490 495 

Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Pro Met 
500 505 510 

Pro Arg Ser Ser Gln Glu Glu Lys Asp Glu Lys Glu Lys Glu Lys Glu 
515 520 525 

Ala Glu Lys Glu Glu Glu Lys Pro Glu Val Glu Asn Asp Lys Glu Asp 
530 535 540 

Leu Leu Lys Glu Lys Thr Asp Asp Thr Ser Gly Glu Asp Asn Asp Glu 
545 550 555 560 

Lys Glu Ala Val Ala Ser Lys Gly Arg Lys Thr Ala Asn Ser Gln Gly 
565 570 575 

Arg Arg Lys Gly Arg Ile Thr Arg Ser Met Ala Asn Glu Ala Asn Ser 
580 585 590 

Glu Glu Ala Ile Thr Pro Gln Gln Ser Ala Glu Leu Ala Ser Met Glu 
595 600 605 

Leu Asn Glu Ser Ser Arg Trp Thr Glu Glu Glu Met Glu Thr Ala Lys 
610 615 620 

Lys Gly Leu Leu Glu His Gly Arg Asn Trp Ser Ala Ile Ala Arg Met 
625 630 635 640 

Val Gly Ser Lys Thr Val Ser Gln Cys Lys Asn Phe Tyr Phe Asn Tyr 
645 650 655 

Lys Lys Arg Gln Asn Leu Asp Glu Ile Leu Gln Gln His Lys Leu Lys 
660 665 670 

Met Glu Lys Glu Arg Asn Ala Arg Arg Lys Lys Lys Lys Ala Pro Ala 
675 680 685 

Ala Ala Ser Glu Glu Ala Ala Phe Pro Pro Val Val Glu Asp Glu Glu 
690 695 700 

Met Glu Ala Ser Gly Val Ser Gly Asn Glu Glu Glu Met Val Glu Glu 
705 710 715 720 

Ala Glu Ala Leu His Ala Ser Gly Asn Glu Val Pro Arg Gly Glu Cys 
725 730 735 

Ser Gly Pro Ala Thr Val Asn Asn Ser Ser Asp Thr Glu Ser Ile Pro 
740 745 750 

Ser Pro His Thr Glu Ala Ala Lys Asp Thr Gly Gln Asn Gly Pro Lys 
755 760 765 

Pro Pro Ala Thr Leu Gly Ala Asp Gly Pro Pro Pro Gly Pro Pro Thr 
770 775 780 

Pro Pro Arg Arg Thr Ser Arg Ala Pro Ile Glu Pro Thr Pro Ala Ser 
785 790 795 800 

Glu Ala Thr Gly Ala Pro Thr Pro Pro Pro Ala Pro Pro Ser Pro Ser 
805 810 815 

Ala Pro Pro Pro Val Val Pro Lys Glu Glu Lys Glu Glu Glu Thr Ala 
820 825 830 

Ala Ala Pro Pro Val Glu Glu Gly Glu Glu Gln Lys Pro Pro Ala Ala 
835 840 845 

Glu Glu Leu Ala Val Asp Thr Gly Lys Ala Glu Glu Pro Val Lys Ser 
850 855 860 

Glu Cys Thr Glu Glu Ala Glu Glu Gly Pro Ala Lys Gly Lys Asp Ala 
865 870 875 880 

Glu Ala Ala Glu Ala Thr Ala Glu Gly Ala Leu Lys Ala Glu Lys Lys 
885 890 895 

Glu Gly Gly Ser Gly Arg Ala Thr Thr Ala Lys Ser Ser Gly Ala Pro 
900 905 910 

Gln Asp Ser Asp Ser Ser Ala Thr Cys Ser Ala Asp Glu Val Asp Glu 
915 920 925 

Ala Glu Gly Gly Asp Lys Asn Arg Leu Leu Ser Pro Arg Pro Ser Leu 
930 935 940 

Leu Thr Pro Thr Gly Asp Pro Arg Ala Asn Ala Ser Pro Gln Lys Pro 
945 950 955 960 

Leu Asp Leu Lys Gln Leu Lys Gln Arg Ala Ala Ala Ile Pro Pro Ile 
965 970 975 

Gln Val Thr Lys Val His Glu Pro Pro Arg Glu Asp Ala Ala Pro Thr 
980 985 990 

Lys Pro Ala Pro Pro Ala Pro Pro Pro Pro Gln Asn Leu Gln Pro Glu 
995 1000 1005 

Ser Asp Ala Pro Gln Gln Pro Gly Ser Ser Pro Arg Gly Lys Ser Arg 
1010 1015 1020 

Ser Pro Ala Pro Pro Ala Asp Lys Glu Ala Phe Ala Ala Glu Ala Gln 
1025 1030 1035 1040 

Lys Leu Pro Gly Asp Pro Pro Cys Trp Thr Ser Gly Leu Pro Phe Pro 
1045 1050 1055 

Val Pro Pro Arg Glu Val Ile Lys Ala Ser Pro His Ala Pro Asp Pro 
1060 1065 1070 

Ser Ala Phe Ser Tyr Ala Pro Pro Gly His Pro Leu Pro Leu Gly Leu 
1075 1080 1085 

His Asp Thr Ala Arg Pro Val Leu Pro Arg Pro Pro Thr Ile Ser Asn 
1090 1095 1100 

Pro Pro Pro Leu Ile Ser Ser Ala Lys His Pro Ser Val Leu Glu Arg 
1105 1110 1115 1120 

Gln Ile Gly Ala Ile Ser Gln Gly Met Ser Val Gln Leu His Val Pro 
1125 1130 1135 

Tyr Ser Glu His Ala Lys Ala Pro Val Gly Pro Val Thr Met Gly Leu 
1140 1145 1150 

Pro Leu Pro Met Asp Pro Lys Lys Leu Ala Pro Phe Ser Gly Val Lys 
1155 1160 1165 

Gln Glu Gln Leu Ser Pro Arg Gly Gln Ala Gly Pro Pro Glu Ser Leu 
1170 1175 1180 

Gly Val Pro Thr Ala Gln Glu Ala Ser Val Leu Arg Gly Thr Ala Leu 
1185 1190 1195 1200 

Gly Ser Val Pro Gly Gly Ser Ile Thr Lys Gly Ile Pro Ser Thr Arg 
1205 1210 1215 

Val Pro Ser Asp Ser Ala Ile Thr Tyr Arg Gly Ser Ile Thr His Gly 
1220 1225 1230 

Thr Pro Ala Asp Val Leu Tyr Lys Gly Thr Ile Thr Arg Ile Ile Gly 
1235 1240 1245 

Glu Asp Ser Pro Ser Arg Leu Asp Arg Gly Arg Glu Asp Ser Leu Pro 
1250 1255 1260 

Lys Gly His Val Ile Tyr Glu Gly Lys Lys Gly His Val Leu Ser Tyr 
1265 1270 1275 1280 

Glu Gly Gly Met Ser Val Thr Gln Cys Ser Lys Glu Asp Gly Arg Ser 
1285 1290 1295 

Ser Ser Gly Pro Pro His Glu Thr Ala Ala Pro Lys Arg Thr Tyr Asp 
1300 1305 1310 

Met Met Glu Gly Arg Val Gly Arg Ala Ile Ser Ser Ala Ser Ile Glu 
1315 1320 1325 

Gly Leu Met Gly Arg Ala Ile Pro Pro Glu Arg His Ser Pro His His 
1330 1335 1340 

Leu Lys Glu Gln His His Ile Arg Gly Ser Ile Thr Gln Gly Ile Pro 
1345 1350 1355 1360 

Arg Ser Tyr Val Glu Ala Gln Glu Asp Tyr Leu Arg Arg Glu Ala Lys 
1365 1370 1375 

Leu Leu Lys Arg Glu Gly Thr Pro Pro Pro Pro Pro Pro Ser Arg Asp 
1380 1385 1390 

Leu Thr Glu Ala Tyr Lys Thr Gln Ala Leu Gly Pro Leu Lys Leu Lys 
1395 1400 1405 

Pro Ala His Glu Gly Leu Val Ala Thr Val Lys Glu Ala Gly Arg Ser 
1410 1415 1420 

Ile His Glu Ile Pro Arg Glu Glu Leu Arg His Thr Pro Glu Leu Pro 
1425 1430 1435 1440 

Leu Ala Pro Arg Pro Leu Lys Glu Gly Ser Ile Thr Gln Gly Thr Pro 
1445 1450 1455 

Leu Lys Tyr Asp Thr Gly Ala Ser Thr Thr Gly Ser Lys Lys His Asp 
1460 1465 1470 

Val Arg Ser Leu Ile Gly Ser Pro Gly Arg Thr Phe Pro Pro Val His 
1475 1480 1485 

Pro Leu Asp Val Met Ala Asp Ala Arg Ala Leu Glu Arg Ala Cys Tyr 
1490 1495 1500 

Glu Glu Ser Leu Lys Ser Arg Pro Gly Thr Ala Ser Ser Ser Gly Gly 
1505 1510 1515 1520 

Ser Ile Ala Arg Gly Ala Pro Val Ile Val Pro Glu Leu Gly Lys Pro 
1525 1530 1535 

Arg Gln Ser Pro Leu Thr Tyr Glu Asp His Gly Ala Pro Phe Ala Gly 
1540 1545 1550 

His Leu Pro Arg Gly Ser Pro Val Thr Met Arg Glu Pro Thr Pro Arg 
1555 1560 1565 

Leu Gln Glu Gly Ser Leu Ser Ser Ser Lys Ala Ser Gln Asp Arg Lys 
1570 1575 1580 

Leu Thr Ser Thr Pro Arg Glu Ile Ala Lys Ser Pro His Ser Thr Val 
1585 1590 1595 1600 

Pro Glu His His Pro His Pro Ile Ser Pro Tyr Glu His Leu Leu Arg 
1605 1610 1615 

Gly Val Ser Gly Val Asp Leu Tyr Arg Ser His Ile Pro Leu Ala Phe 
1620 1625 1630 

Asp Pro Thr Ser Ile Pro Arg Gly Ile Pro Leu Asp Ala Ala Ala Ala 
1635 1640 1645 

Tyr Tyr Leu Pro Arg His Leu Ala Pro Asn Pro Thr Tyr Pro His Leu 
1650 1655 1660 

Tyr Pro Pro Tyr Leu Ile Arg Gly Tyr Pro Asp Thr Ala Ala Leu Glu 
1665 1670 1675 1680 

Asn Arg Gln Thr Ile Ile Asn Asp Tyr Ile Thr Ser Gln Gln Met His 
1685 1690 1695 

His Asn Thr Ala Thr Ala Met Ala Gln Arg Ala Asp Met Leu Arg Gly 
1700 1705 1710 

Leu Ser Pro Arg Glu Ser Ser Leu Ala Leu Asn Tyr Ala Ala Gly Pro 
1715 1720 1725 

Arg Gly Ile Ile Asp Leu Ser Gln Val Pro His Leu Pro Val Leu Val 
1730 1735 1740 

Pro Pro Thr Pro Gly Thr Pro Ala Thr Ala Met Asp Arg Leu Ala Tyr 
1745 1750 1755 1760 

Leu Pro Thr Ala Pro Gln Pro Phe Ser Ser Arg His Ser Ser Ser Pro 
1765 1770 1775 

Leu Ser Pro Gly Gly Pro Thr His Leu Thr Lys Pro Thr Thr Thr Ser 
1780 1785 1790 

Ser Ser Glu Arg Glu Arg Asp Arg Asp Arg Glu Arg Asp Arg Asp Arg 
1795 1800 1805 

Glu Arg Glu Lys Ser Ile Leu Thr Ser Thr Thr Thr Val Glu His Ala 
1810 1815 1820 

Pro Ile Trp Arg Pro Gly Thr Glu Gln Ser Ser Gly Ser Ser Gly Ser 
1825 1830 1835 1840 

Ser Gly Gly Gly Gly Gly Ser Ser Ser Arg Pro Ala Ser His Ser His 
1845 1850 1855 

Ala His Gln His Ser Pro Ile Ser Pro Arg Thr Gln Asp Ala Leu Gln 
1860 1865 1870 

Gln Arg Pro Ser Val Leu His Asn Thr Gly Met Lys Gly Ile Ile Thr 
1875 1880 1885 
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Ala Val Glu Pro Ser Lys Pro Thr Val Leu Arg Ser Thr Ser Thr Ser 

1890 1895 1900 

Ser Pro Val Arg Pro Ala Ala Thr Phe Pro Pro Ala Thr His Cys Pro 
1905 1910 1915 1920 

Leu Gly Gly Thr Leu Asp Gly Val Tyr Pro Thr Leu Met Glu Pro Val 

1925 1930 1935 

Leu Leu Pro Lys Glu Ala Pro Arg Val Ala Arg Pro Glu Arg Pro Arg 

1940 1945 1950 

Ala Asp Thr Gly His Ala Phe Leu Ala Lys Pro Pro Ala Arg Ser Gly 

1955 1960 1965 

Leu Glu Pro Ala Ser Ser Pro Ser Lys Gly Ser Glu Pro Arg Pro Leu 

1970 1975 1980 

Val Pro Pro Val Ser Gly His Ala Thr lie Ala Arg Thr Pro Ala Lys 
1985 1990 1995 2000 

Asn Leu Ala Pro His His Ala Ser Pro Asp Pro Pro Ala Pro Pro Ala 

2005 2010 2015 

Ser Ala Ser Asp Pro His Arg Glu Lys Thr Gin Ser Lys Pro Phe Ser 

2020 2025 2030 

lie Gin Glu Leu Glu Leu Arg Ser Leu Gly Tyr His Gly Ser Ser Tyr 

2035 2040 2045 

Ser Pro Glu Gly Val Glu Pro Val Ser Pro Val Ser Ser Pro Ser Leu 

2050 2055 2060 

Thr His Asp Lys Gly Leu Pro Lys His Leu Glu Glu Leu Asp Lys Ser 
2065 2070 2075 2080 

His Leu Glu Gly Glu Leu Arg Pro Lys Gln Pro Gly Pro Val Lys Leu

2085 2090 2095 

Gly Gly Glu Ala Ala His Leu Pro His Leu Arg Pro Leu Pro Glu Ser 

2100 2105 2110 

Gln Pro Ser Ser Ser Pro Leu Leu Gln Thr Ala Pro Gly Val Lys Gly

2115 2120 2125 

His Gln Arg Val Val Thr Leu Ala Gln His Ile Ser Glu Val Ile Thr

2130 2135 2140 

Gln Asp Tyr Thr Arg His His Pro Gln Gln Leu Ser Ala Pro Leu Pro
2145 2150 2155 2160 

Ala Pro Leu Tyr Ser Phe Pro Gly Ala Ser Cys Pro Val Leu Asp Leu 

2165 2170 2175 

Arg Arg Pro Pro Ser Asp Leu Tyr Leu Pro Pro Pro Asp His Gly Ala 

2180 2185 2190 

Pro Ala Arg Gly Ser Pro His Ser Glu Gly Gly Lys Arg Ser Pro Glu 

2195 2200 2205 

Pro Asn Lys Thr Ser Val Leu Gly Gly Gly Glu Asp Gly Ile Glu Pro

2210 2215 2220 

Val Ser Pro Pro Glu Gly Met Thr Glu Pro Gly His Ser Arg Ser Ala 
2225 2230 2235 2240 

Val Tyr Pro Leu Leu Tyr Arg Asp Gly Glu Gln Thr Glu Pro Ser Arg

2245 2250 2255 

Met Gly Ser Lys Ser Pro Gly Asn Thr Ser Gln Pro Pro Ala Phe Phe

2260 2265 2270 

Ser Lys Leu Thr Glu Ser Asn Ser Ala Met Val Lys Ser Lys Lys Gln

2275 2280 2285 

Glu Ile Asn Lys Lys Leu Asn Thr His Asn Arg Asn Glu Pro Glu Tyr

2290 2295 2300 

Asn Ile Ser Gln Pro Gly Thr Glu Ile Phe Asn Met Pro Ala Ile Thr
2305 2310 2315 2320 

Gly Thr Gly Leu Met Thr Tyr Arg Ser Gln Ala Val Gln Glu His Ala

2325 2330 2335 

Ser Thr Asn Met Gly Leu Glu Ala Ile Ile Arg Lys Ala Leu Met Gly

2340 2345 2350 

Lys Tyr Asp Gln Trp Glu Glu Ser Pro Pro Leu Ser Ala Asn Ala Phe
2355 2360 2365 
Asn Pro Leu Asn Ala Ser Ala Ser Leu Pro Ala Ala Met Pro Ile Thr

2370 2375 2380 

Ala Ala Asp Gly Arg Ser Asp His Thr Leu Thr Ser Pro Gly Gly Gly 
2385 2390 2395 2400 

Gly Lys Ala Lys Val Ser Gly Arg Pro Ser Ser Arg Lys Ala Lys Ser 

2405 2410 2415 

Pro Ala Pro Gly Leu Ala Ser Gly Asp Arg Pro Pro Ser Val Ser Ser 

2420 2425 2430 

Val His Ser Glu Gly Asp Cys Asn Arg Arg Thr Pro Leu Thr Asn Arg 

2435 2440 2445 

Val Trp Glu Asp Arg Pro Ser Ser Ala Gly Ser Thr Pro Phe Pro Tyr 

2450 2455 2460 

Asn Pro Leu Ile Met Arg Leu Gln Ala Gly Val Met Ala Ser Pro Pro
2465 2470 2475 2480 

Pro Pro Gly Leu Pro Ala Gly Ser Gly Pro Leu Ala Gly Pro His His 

2485 2490 2495 

Ala Trp Asp Glu Glu Pro Lys Pro Leu Leu Cys Ser Gln Tyr Glu Thr

2500 2505 2510 

Leu Ser Asp Ser Glu 
2515 



<210> 6 

<211> 8388 

<212> DNA 

<213> Mus musculus 



<220> 
<221> CDS 

<222> (626)...(8047)



<221> misc_feature
<222> (1)...(8388)
<223> n = A,T,C or G 



<400> 6 

cttaaaaaaa aaacccttac ttgtggttaa aggaaaagaa ataaagactt aggaaaaatg 60 

taattttcca gggggtacct acacccaaga catatggttc tcaagaggna ctcagcatat 120

cactttgatt ccagagaagc tacaaaagtc attaccaaac tccaggctgg aaagcagtgc 180 

tcatactaaa tatttaaaca tttaaagacc tgattaagag acatcaaagg ctttatacca 240 

ggggcacacc aacagagaca ggctttttca aggataattt atgtctgccc attgtcttct 300

ggcttaggag acatagaggg aaacatcacc taggaaaacc agtaaccaat gtgtaccatc 360

caggagttat tctatgacaa aaccaaaagt tttgttcttg tgtacttctc tgtgcaccat 420 

ctttctatat ctatttagaa aacaaaacaa attttggtaa cacgcttgtg tataaagagc 480 

aggacagcgg tgtcacagat caacctagaa agtaattatt taacgagtaa atgactcata 540

taggacaagg caagctgtga ctttcaacct gttctgtctc gtgccgaatt cggcacgagc 600 

caaagcctac ctggacccta ccacc atg tca gga tcc aca cag cct gtg gca 652
Met Ser Gly Ser Thr Gln Pro Val Ala

cag aca tgg cgg gct gct gag ccc cgc tac cca ccc cat ggc atc tcc 700
Gln Thr Trp Arg Ala Ala Glu Pro Arg Tyr Pro Pro His Gly Ile Ser
10 15 20 25

tac ccg gtg cag ata gcc cgg tcc cac acg gac gtg ggg ctg ctt gag
Tyr Pro Val Gln Ile Ala Arg Ser His Thr Asp Val Gly Leu Leu Glu

tac caa cac cac ccc cgt gac tac acc tca cac ctg tca ccc ggt tcc 796
Tyr Gln His His Pro Arg Asp Tyr Thr Ser His Leu Ser Pro Gly Ser
45 50 55
atc atc cag cca cag agg agg cgg ccc tca ctg ctg tca gag ttc cag
Ile Ile Gln Pro Gln Arg Arg Arg Pro Ser Leu Leu Ser Glu Phe Gln

cct ggg agt gaa cgg tct cag gag ctc cac ctg cgc cct gag tcc cgc
Pro Gly Ser Glu Arg Ser Gln Glu Leu His Leu Arg Pro Glu Ser Arg

acg ttc ctg cct gag ctg ggc aag ccc gac ata gaa ttc acc gag agc
Thr Phe Leu Pro Glu Leu Gly Lys Pro Asp Ile Glu Phe Thr Glu Ser
90 95 100 105

aag cgc ccc cgc ctg gag cta cta ccc gat acc ctg ctg cgc cca tca
Lys Arg Pro Arg Leu Glu Leu Leu Pro Asp Thr Leu Leu Arg Pro Ser
110 115 120

ccc ctg ctg gcc act ggg cag ccg agt ggg tct gaa gac ctt acc aag
Pro Leu Leu Ala Thr Gly Gln Pro Ser Gly Ser Glu Asp Leu Thr Lys
125 130 135

gac cgt agc ctg gca ggc aag ctg gag cct gtg tca cct ccc agt ccc
Asp Arg Ser Leu Ala Gly Lys Leu Glu Pro Val Ser Pro Pro Ser Pro
140 145 150

ccg cac gct gac cct gag cta gag ctg gcg cca tct cga ctg tcc aag
Pro His Ala Asp Pro Glu Leu Glu Leu Ala Pro Ser Arg Leu Ser Lys
155 160 165

gag gag ctg atc cag aac atg gac cgc gtg gac cgt gag atc acc atg
Glu Glu Leu Ile Gln Asn Met Asp Arg Val Asp Arg Glu Ile Thr Met
170 175 180 185

gta gag cag cag atc tcc aag ctg aag aag aag cag caa cag ttg gag
Val Glu Gln Gln Ile Ser Lys Leu Lys Lys Lys Gln Gln Gln Leu Glu
190 195 200

gag gag gcc gcc aag ccg ccc gaa ccc gag aag cct gtg tcg cca cca
Glu Glu Ala Ala Lys Pro Pro Glu Pro Glu Lys Pro Val Ser Pro Pro
205 210 215

ccc ata gaa tca aag cac cga agc ctg gtc cag atc atc tac gat gag
Pro Ile Glu Ser Lys His Arg Ser Leu Val Gln Ile Ile Tyr Asp Glu
220 225 230

aac cgg aag aaa gcc gaa gcc gca cac cgg atc cta gaa ggc ctg ggg
Asn Arg Lys Lys Ala Glu Ala Ala His Arg Ile Leu Glu Gly Leu Gly
235 240 245

ccc cag gtg gag ctg cct ctg tac aac cag ccg tct gac aca cgc cag
Pro Gln Val Glu Leu Pro Leu Tyr Asn Gln Pro Ser Asp Thr Arg Gln
250 255 260 265

tac cat gaa aac atc aaa ata aac cag gcg atg cgg aag aag ctg atc
Tyr His Glu Asn Ile Lys Ile Asn Gln Ala Met Arg Lys Lys Leu Ile
270 275 280

ttg tac ttt aag cgg agg aac cac gcg cgc aag cag tgg gaa cag cgc
Leu Tyr Phe Lys Arg Arg Asn His Ala Arg Lys Gln Trp Glu Gln Arg
285 290 295
ttc tgc cag cgc tat gac cag ctc atg gag gcg tgg gag aag aag gta
Phe Cys Gln Arg Tyr Asp Gln Leu Met Glu Ala Trp Glu Lys Lys Val
300 305 310

gag cgc ata gag aac aat ccg cga agg agg gcc aag gag agc aag gtg
Glu Arg Ile Glu Asn Asn Pro Arg Arg Arg Ala Lys Glu Ser Lys Val
315 320 325

agg gag tac tac gag aaa cag ttc ccg gag atc cgc aag cag cgg gag
Arg Glu Tyr Tyr Glu Lys Gln Phe Pro Glu Ile Arg Lys Gln Arg Glu
330 335 340 345

ctg cag gag cgc atg cag agc agg gtg ggc cag cgt ggc agt ggg ctc
Leu Gln Glu Arg Met Gln Ser Arg Val Gly Gln Arg Gly Ser Gly Leu
350 355 360

tcc atg tcg gct gcc cgc agt gag cat gag gtt tct gag atc att gat
Ser Met Ser Ala Ala Arg Ser Glu His Glu Val Ser Glu Ile Ile Asp
365 370 375

ggc ttg tct gag cag gag aac ctg gag aag cag atg cgc cag ctg gcc
Gly Leu Ser Glu Gln Glu Asn Leu Glu Lys Gln Met Arg Gln Leu Ala
380 385 390

gtg atc ccg ccc atg ttg tac gac gcg gac cag cag agg atc aag ttc
Val Ile Pro Pro Met Leu Tyr Asp Ala Asp Gln Gln Arg Ile Lys Phe
395 400 405

atc aac atg aat gga ctc atg gat gac ccc atg aag gtc tac aag gac
Ile Asn Met Asn Gly Leu Met Asp Asp Pro Met Lys Val Tyr Lys Asp
410 415 420 425

cgt cag gtt acc aac atg tgg agc gag cag gag agg gac acc ttc cgt
Arg Gln Val Thr Asn Met Trp Ser Glu Gln Glu Arg Asp Thr Phe Arg
430 435 440

gag aag ttt atg cag cac cct aag aac ttt ggc ctg att gcc tca ttc
Glu Lys Phe Met Gln His Pro Lys Asn Phe Gly Leu Ile Ala Ser Phe
445 450 455

ctg gag aga aag acg gtc gct gag tgt gtc ctc tat tac tac ctg acc
Leu Glu Arg Lys Thr Val Ala Glu Cys Val Leu Tyr Tyr Tyr Leu Thr
460 465 470

aag aag aat gaa aat tac aag agc ttg gtg agg cgg agc tat cgg cgc
Lys Lys Asn Glu Asn Tyr Lys Ser Leu Val Arg Arg Ser Tyr Arg Arg
475 480 485

cgt ggc aag agc cag cag cag cag cag cag caa caa cag cag cag cag
Arg Gly Lys Ser Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln
490 495 500 505

cag cag atg gca cgg agc agc cag gag gag aag gag gag aag gag aag
Gln Gln Met Ala Arg Ser Ser Gln Glu Glu Lys Glu Glu Lys Glu Lys
510 515 520

gag aag gag gcc gac aag gag gaa gag aag cag gat gcg gag aac gag
Glu Lys Glu Ala Asp Lys Glu Glu Glu Lys Gln Asp Ala Glu Asn Glu
525 530 535

aag gaa gaa ctc agc aag gag aag aca gac gac act tct ggc gag gac 2284
Lys Glu Glu Leu Ser Lys Glu Lys Thr Asp Asp Thr Ser Gly Glu Asp
540 545 550

aac cat gag aaa gag gcc gtg gcc tcc aaa ggc cgc aaa act gcc aac 2332
Asn His Glu Lys Glu Ala Val Ala Ser Lys Gly Arg Lys Thr Ala Asn
555 560 565

agc caa ggc cgc cgc aaa ggc cgt atc acg cgc tcc atg gcc aac gag 2380
Ser Gln Gly Arg Arg Lys Gly Arg Ile Thr Arg Ser Met Ala Asn Glu
570 575 580 585

gcc aac cat gag gag aca gcc acc cca cag caa agt tca gag ctg gct 2428
Ala Asn His Glu Glu Thr Ala Thr Pro Gln Gln Ser Ser Glu Leu Ala
590 595 600

tcc atg gag atg aac gag agt tct cgc tgg act gag gaa gag atg gag 2476
Ser Met Glu Met Asn Glu Ser Ser Arg Trp Thr Glu Glu Glu Met Glu
605 610 615

aca gca aag aaa ggc ctc ctg gaa cat ggg agg aac tgg tca gcc att 2524
Thr Ala Lys Lys Gly Leu Leu Glu His Gly Arg Asn Trp Ser Ala Ile
620 625 630

gcc cgc atg gtg ggc tcc aag acc gtg tcc cag tgt aag aac ttc tac 2572
Ala Arg Met Val Gly Ser Lys Thr Val Ser Gln Cys Lys Asn Phe Tyr
635 640 645

ttc aac tac aag aag agg cag aac ctg gac gaa atc ctt cag cag cac 2620
Phe Asn Tyr Lys Lys Arg Gln Asn Leu Asp Glu Ile Leu Gln Gln His
650 655 660 665

aag cta aag atg gag aag gag agg aac gct cgg agg aag aag aag aag 2668
Lys Leu Lys Met Glu Lys Glu Arg Asn Ala Arg Arg Lys Lys Lys Lys
670 675 680

acc cca gct gcg gcg agc gag gag aca gcc ttc cca cct gcc gct gag 2716
Thr Pro Ala Ala Ala Ser Glu Glu Thr Ala Phe Pro Pro Ala Ala Glu
685 690 695

gac gaa gag atg gaa gca tca ggc gca agt gcc aat gag gaa gag ctg 2764
Asp Glu Glu Met Glu Ala Ser Gly Ala Ser Ala Asn Glu Glu Glu Leu
700 705 710

gcg gag gag gca gaa gcc tca cag gcc tct ggg aat gag gtt ccc aga 2812
Ala Glu Glu Ala Glu Ala Ser Gln Ala Ser Gly Asn Glu Val Pro Arg
715 720 725

gtt ggg gag tgc agt ggc cca gct gct gtc aac aac agc tct gat act 2860
Val Gly Glu Cys Ser Gly Pro Ala Ala Val Asn Asn Ser Ser Asp Thr
730 735 740 745

gag agt gtc cca tcc ccg cgt tca gaa gcc atg aag gac act ggg cct 2908
Glu Ser Val Pro Ser Pro Arg Ser Glu Ala Met Lys Asp Thr Gly Pro
750 755 760

aaa ccc act ggc act gaa gca ttg ccc gct gcc acc cag cca cct gtt 2956
Lys Pro Thr Gly Thr Glu Ala Leu Pro Ala Ala Thr Gln Pro Pro Val
765 770 775

cct cct cca gaa gaa ccg gca gta gcc cct gct gag ccc tcc cca gtc
Pro Pro Pro Glu Glu Pro Ala Val Ala Pro Ala Glu Pro Ser Pro Val
780 785 790

cct gat gcc agt ggc cca cca tcc cca gag cct tcc cat cac ctg ccg
Pro Asp Ala Ser Gly Pro Pro Ser Pro Glu Pro Ser His His Leu Pro
795 800 805

cac ccc cgg cta ctg tgg aca agg atg aac aag aag ccc cgg ctg ctc
His Pro Arg Leu Leu Trp Thr Arg Met Asn Lys Lys Pro Arg Leu Leu
810 815 820 825

cag ctc ccc aga cag agg atg cca agg agc aga agt ctg agg ccg agg
Gln Leu Pro Arg Gln Arg Met Pro Arg Ser Arg Ser Leu Arg Pro Arg
830 835 840

aga tcg atg tgg gaa aag cca gag gag ccc gag gcc tct gag gag ccc
Arg Ser Met Trp Glu Lys Pro Glu Glu Pro Glu Ala Ser Glu Glu Pro
845 850 855

ccg gag agt gta aag agt gac cac aag gag gag acc gag gaa gag cct
Pro Glu Ser Val Lys Ser Asp His Lys Glu Glu Thr Glu Glu Glu Pro
860 865 870

gaa gac aaa gcc aag ggc aca gag gcc att gaa act gtg tct gag gca
Glu Asp Lys Ala Lys Gly Thr Glu Ala Ile Glu Thr Val Ser Glu Ala
875 880 885

cca ctt aag gtg gag gag gct ggt agc aag gca gct gtg acc aag ggt
Pro Leu Lys Val Glu Glu Ala Gly Ser Lys Ala Ala Val Thr Lys Gly
890 895 900 905

tcc agc tca ggt gcc acc cag gac agt gac ttc agt gcc acc tgc agt
Ser Ser Ser Gly Ala Thr Gln Asp Ser Asp Phe Ser Ala Thr Cys Ser
910 915 920

gcc gat gag gtg gac gaa ccc gaa gga ggt gac aag ggc agg ctg ctg
Ala Asp Glu Val Asp Glu Pro Glu Gly Gly Asp Lys Gly Arg Leu Leu
925 930 935

tca cca agg ccc agc ctc ctc acc ccg gct gga gat ccc cgg gcc agt
Ser Pro Arg Pro Ser Leu Leu Thr Pro Ala Gly Asp Pro Arg Ala Ser
940 945 950

acc tcg ccc cag aag ccg ctg gac ctg aag cag ctg aag cag cga gca
Thr Ser Pro Gln Lys Pro Leu Asp Leu Lys Gln Leu Lys Gln Arg Ala
955 960 965

gcc gcc atc ccc cct atc cag gtc acc aag gtc cat gag ccc ccc cgg
Ala Ala Ile Pro Pro Ile Gln Val Thr Lys Val His Glu Pro Pro Arg
970 975 980 985

gag gac aca gta ccc cca aag cca gtt ccc cct gtg cct cca ccc acg
Glu Asp Thr Val Pro Pro Lys Pro Val Pro Pro Val Pro Pro Pro Thr
990 995 1000

cag cac cta cag cca gag ggt gac gtg tct cag cag tcg gga gga agt
Gln His Leu Gln Pro Glu Gly Asp Val Ser Gln Gln Ser Gly Gly Ser
1005 1010 1015
cca cgt ggc aag tcc cgc agc cca gtg cct cct gcc gag aaa gag gca
Pro Arg Gly Lys Ser Arg Ser Pro Val Pro Pro Ala Glu Lys Glu Ala
1020 1025 1030

gag aaa ccc gca ttc ttt ccg gct ttc cca act gag ggc cca aag cta
Glu Lys Pro Ala Phe Phe Pro Ala Phe Pro Thr Glu Gly Pro Lys Leu
1035 1040 1045

ccg act gag ccc cca cgc tgg tca tcg ggc ctg ccc ttc ccc atc cct
Pro Thr Glu Pro Pro Arg Trp Ser Ser Gly Leu Pro Phe Pro Ile Pro
1050 1055 1060 1065

cca cgg gag gtg atc aag act tcc cca cac gcc gct gac ccc tct gcc
Pro Arg Glu Val Ile Lys Thr Ser Pro His Ala Ala Asp Pro Ser Ala
1070 1075 1080

ttc tcc tac aca ccc ccc ggt cac ccg ctg cct ctg ggc ctc cac gat
Phe Ser Tyr Thr Pro Pro Gly His Pro Leu Pro Leu Gly Leu His Asp
1085 1090 1095

agt gcc cgg ccc gtc ctg cca cgt ccc ccc atc tct aac ccc cca ccc
Ser Ala Arg Pro Val Leu Pro Arg Pro Pro Ile Ser Asn Pro Pro Pro
1100 1105 1110

ctc atc tcc tct gcc aag cat ccc ggc gta ctt gag agg cag ctg ggt
Leu Ile Ser Ser Ala Lys His Pro Gly Val Leu Glu Arg Gln Leu Gly
1115 1120 1125

gcc atc tcc cag cag ggg atg tca gtc cag ctt cgt gtg cct cac tca
Ala Ile Ser Gln Gln Gly Met Ser Val Gln Leu Arg Val Pro His Ser
1130 1135 1140 1145

gag cat gcc aag gcc ccc atg ggc cct ctc acc atg ggg ctg ccc ctt
Glu His Ala Lys Ala Pro Met Gly Pro Leu Thr Met Gly Leu Pro Leu
1150 1155 1160

gcc gtg gac cct aag aag ctg ggg aca gca ctg ggc tcc gcc acc agt
Ala Val Asp Pro Lys Lys Leu Gly Thr Ala Leu Gly Ser Ala Thr Ser
1165 1170 1175

gga agc atc acc aag ggc ctc ccc agt acc cgg gct gca gac ggc ccc
Gly Ser Ile Thr Lys Gly Leu Pro Ser Thr Arg Ala Ala Asp Gly Pro
1180 1185 1190

agc tac aga ggc tct atc acc cac ggc acg ccc gca gac gtc ctc tac
Ser Tyr Arg Gly Ser Ile Thr His Gly Thr Pro Ala Asp Val Leu Tyr
1195 1200 1205

aag ggt acc atc agc agg atc gtc ggt gag gac agc cca agt cgc ctt
Lys Gly Thr Ile Ser Arg Ile Val Gly Glu Asp Ser Pro Ser Arg Leu
1210 1215 1220 1225

gac cgg gca cga gag gac acc ctg ccc aag ggc cat gtc atc tat gag
Asp Arg Ala Arg Glu Asp Thr Leu Pro Lys Gly His Val Ile Tyr Glu
1230 1235 1240

ggc aag aaa ggc cac gtc cta tcc tat gaa ggt ggt atg tcc gtg tca
Gly Lys Lys Gly His Val Leu Ser Tyr Glu Gly Gly Met Ser Val Ser
1245 1250 1255
cag tgc tct aag gag gat gga agg agc agc tcg ggc cca ccc cat gag
Gln Cys Ser Lys Glu Asp Gly Arg Ser Ser Ser Gly Pro Pro His Glu
1260 1265 1270

act gcc gcc cct aaa cgc acc tat gac atg atg gag ggc cgt gta ggc
Thr Ala Ala Pro Lys Arg Thr Tyr Asp Met Met Glu Gly Arg Val Gly
1275 1280 1285

agg act gtc acc tca gcc agc ata gag gga ctc atg ggc cgc gcc atc
Arg Thr Val Thr Ser Ala Ser Ile Glu Gly Leu Met Gly Arg Ala Ile
1290 1295 1300 1305

cct gag cag cac agc ccc cac ctc aag gag cag cat cac atc cga ggc
Pro Glu Gln His Ser Pro His Leu Lys Glu Gln His His Ile Arg Gly
1310 1315 1320

tcc atc acg caa ggc atc ccg agg tcc tat gtg gag gcg cag gag gac
Ser Ile Thr Gln Gly Ile Pro Arg Ser Tyr Val Glu Ala Gln Glu Asp
1325 1330 1335

tac tta cgg cgg gag gcc aag ctc ttg aag cga gaa ggg aca cca cca
Tyr Leu Arg Arg Glu Ala Lys Leu Leu Lys Arg Glu Gly Thr Pro Pro
1340 1345 1350

ccc cca cca cca cct cgg gac ctg act gag acc tac aag ccc cgg ccc
Pro Pro Pro Pro Pro Arg Asp Leu Thr Glu Thr Tyr Lys Pro Arg Pro
1355 1360 1365

ctg gac cct ctg ggt ccc ctg aag ctg aag ccg act cac gag ggt gtg
Leu Asp Pro Leu Gly Pro Leu Lys Leu Lys Pro Thr His Glu Gly Val
1370 1375 1380 1385

gta gca act gtg aag gag gcg ggc cgc tct atc cat gag atc ccg aga
Val Ala Thr Val Lys Glu Ala Gly Arg Ser Ile His Glu Ile Pro Arg
1390 1395 1400

gag gag ctg cgc cgc aca cct gag cta ccc ctg gca cca cgg cct ctg
Glu Glu Leu Arg Arg Thr Pro Glu Leu Pro Leu Ala Pro Arg Pro Leu
1405 1410 1415

aag gag ggt tcc atc acc cag ggc acc cca ctc aag tac gac tct ggg
Lys Glu Gly Ser Ile Thr Gln Gly Thr Pro Leu Lys Tyr Asp Ser Gly
1420 1425 1430

gca ccc tcc act ggc acc aag aaa cac gac gtg cgc tcc atc atc ggc
Ala Pro Ser Thr Gly Thr Lys Lys His Asp Val Arg Ser Ile Ile Gly
1435 1440 1445

agc ccc ggc cgg cct ttc cct gcc ctg cac ccg ctg gac ata atg gct
Ser Pro Gly Arg Pro Phe Pro Ala Leu His Pro Leu Asp Ile Met Ala
1450 1455 1460 1465

gac gcc cgg gca ctg gag cgt gcc tgc tat gaa gag agt ctg aag agc
Asp Ala Arg Ala Leu Glu Arg Ala Cys Tyr Glu Glu Ser Leu Lys Ser
1470 1475 1480

cgg tca ggg acc agc agt ggt gca ggg ggc tcc atc aca cgt ggg gct
Arg Ser Gly Thr Ser Ser Gly Ala Gly Gly Ser Ile Thr Arg Gly Ala
1485 1490 1495
cca gtc gtc gtg cct gaa ctg ggc aag cca cgg caa agc cca ctg act
Pro Val Val Val Pro Glu Leu Gly Lys Pro Arg Gln Ser Pro Leu Thr
1500 1505 1510

tac gaa gac cac ggg gca ccc ttc acc agt cac ctg cca cgt ggc tcc
Tyr Glu Asp His Gly Ala Pro Phe Thr Ser His Leu Pro Arg Gly Ser
1515 1520 1525

cct gtg acc acg agg gag ccc acg cca cgc ctt cag gaa ggc agc ctc
Pro Val Thr Thr Arg Glu Pro Thr Pro Arg Leu Gln Glu Gly Ser Leu
1530 1535 1540 1545

cta tcc agc aag gcg tcc cag gac cgg aag ctg aca tct aca ccc cgg
Leu Ser Ser Lys Ala Ser Gln Asp Arg Lys Leu Thr Ser Thr Pro Arg
1550 1555 1560

gag atc gcc aag tcc cca cac agc act gtg ccc gag cac cac cct cac
Glu Ile Ala Lys Ser Pro His Ser Thr Val Pro Glu His His Pro His
1565 1570 1575

ccc atc tcc ccc tat gag cac ttg ctc cgg ggc gtg act ggt gtg gac
Pro Ile Ser Pro Tyr Glu His Leu Leu Arg Gly Val Thr Gly Val Asp
1580 1585 1590

ctg tac cgt ggt cac atc cca ttg gcc ttt gac ccc acc tcc ata ccc
Leu Tyr Arg Gly His Ile Pro Leu Ala Phe Asp Pro Thr Ser Ile Pro
1595 1600 1605

cga ggg atc cct ctg gaa gca gca gcc gca gcc tac tac ctg ccc cgg
Arg Gly Ile Pro Leu Glu Ala Ala Ala Ala Ala Tyr Tyr Leu Pro Arg
1610 1615 1620 1625

cac ttg gcc ccc agc ccc acc tac cca cac ctg tac cca cct tac ctc
His Leu Ala Pro Ser Pro Thr Tyr Pro His Leu Tyr Pro Pro Tyr Leu
1630 1635 1640

atc cgc ggc tac cct gac acg gcg gcc ctg gag aac cgc cag acc atc
Ile Arg Gly Tyr Pro Asp Thr Ala Ala Leu Glu Asn Arg Gln Thr Ile
1645 1650 1655

atc aat gac tac atc acc tcg cag cag atg cac cac aac gct gcc tcc
Ile Asn Asp Tyr Ile Thr Ser Gln Gln Met His His Asn Ala Ala Ser
1660 1665 1670

gcc atg gcc cag cgt gct gac atg ctg agg ggt ctg tca ccg cga gag
Ala Met Ala Gln Arg Ala Asp Met Leu Arg Gly Leu Ser Pro Arg Glu
1675 1680 1685

tcc tcg ctg gcc ctc aat tat gcc gct ggc cca aga ggc att atc gac
Ser Ser Leu Ala Leu Asn Tyr Ala Ala Gly Pro Arg Gly Ile Ile Asp
1690 1695 1700 1705

ctg tcc caa gtg cca cac ctg ccc gtg ctg gtg cca cca acg cca ggc
Leu Ser Gln Val Pro His Leu Pro Val Leu Val Pro Pro Thr Pro Gly
1710 1715 1720

acc cct gcc acc gcc atc gac cgc ctt gcc tac ctc ccc act gcg ccc
Thr Pro Ala Thr Ala Ile Asp Arg Leu Ala Tyr Leu Pro Thr Ala Pro
1725 1730 1735
cca ccc ttc agc agc cgc cac agt agc tca ccg ctg tcc cca gga ggc
Pro Pro Phe Ser Ser Arg His Ser Ser Ser Pro Leu Ser Pro Gly Gly
1740 1745 1750

ccc act cac cta gct aaa cca act gcc aca tct tca tcg gag cgg gaa
Pro Thr His Leu Ala Lys Pro Thr Ala Thr Ser Ser Ser Glu Arg Glu
1755 1760 1765

cgg gaa cgt gag cgg gaa cga gac aag tcc atc ctc acg tct acc act
Arg Glu Arg Glu Arg Glu Arg Asp Lys Ser Ile Leu Thr Ser Thr Thr
1770 1775 1780 1785

aca gtg gag cat gca ccc atc tgg aga cct ggt acg gag cag agc agc
Thr Val Glu His Ala Pro Ile Trp Arg Pro Gly Thr Glu Gln Ser Ser
1790 1795 1800

ggg gct ggg ggc agc agc cgc ccc gcc tcc cac acc cac cag cac tcg
Gly Ala Gly Gly Ser Ser Arg Pro Ala Ser His Thr His Gln His Ser
1805 1810 1815

ccc atc tcc ccc cgg acc cag gac gcc ttg cag cag agg ccc agt gtg
Pro Ile Ser Pro Arg Thr Gln Asp Ala Leu Gln Gln Arg Pro Ser Val
1820 1825 1830

ctg cac aac acg agc atg aag ggc gtg gtc acc tcc gtg gaa ccc ggc
Leu His Asn Thr Ser Met Lys Gly Val Val Thr Ser Val Glu Pro Gly
1835 1840 1845

acg ccc acg gtc ctg agg tgg gcc agg tcc acc tcc acc tct tcg cct
Thr Pro Thr Val Leu Arg Trp Ala Arg Ser Thr Ser Thr Ser Ser Pro
1850 1855 1860 1865

gtc cgc cca gct gcc aca ttc cca cct gcc acc cac tgc cca ctt ggt
Val Arg Pro Ala Ala Thr Phe Pro Pro Ala Thr His Cys Pro Leu Gly
1870 1875 1880

ggc acc ctt gaa ggg gtc tac cct acc ctc atg gag ccc gtc ctg tta
Gly Thr Leu Glu Gly Val Tyr Pro Thr Leu Met Glu Pro Val Leu Leu
1885 1890 1895

ccc aag gag acc tct cgg gtc gcc cgg ccc gag cgg gcc cgg gtg gac
Pro Lys Glu Thr Ser Arg Val Ala Arg Pro Glu Arg Ala Arg Val Asp
1900 1905 1910

gct ggc cat gcc ttt ctt acc aaa ccc ccg ggc cgg gag ccc gcc tcc
Ala Gly His Ala Phe Leu Thr Lys Pro Pro Gly Arg Glu Pro Ala Ser
1915 1920 1925

tca ccc agc aag agc tcc gag ccc cga tcc cta gca ccc ccc agc tcc
Ser Pro Ser Lys Ser Ser Glu Pro Arg Ser Leu Ala Pro Pro Ser Ser
1930 1935 1940 1945

agc cac aca gcc atc gcc cgc acc cca gca aag aac ctt gca ccc cac
Ser His Thr Ala Ile Ala Arg Thr Pro Ala Lys Asn Leu Ala Pro His
1950 1955 1960

cat gcc agt ccg gac ccg ccg gcg ccc acc tcg gcc tca gat ctg cac
His Ala Ser Pro Asp Pro Pro Ala Pro Thr Ser Ala Ser Asp Leu His
1965 1970 1975
cga gaa aag act caa agt aaa ccc ttt tcc atc cag gaa ttg gaa ctc
Arg Glu Lys Thr Gln Ser Lys Pro Phe Ser Ile Gln Glu Leu Glu Leu
1980 1985 1990

cgt tct ctg ggt tac cac agt gga gct ggc tac agc ccc gat ggg gtg
Arg Ser Leu Gly Tyr His Ser Gly Ala Gly Tyr Ser Pro Asp Gly Val
1995 2000 2005

gag ccc atc agc ccg gtg agc tcc ccc agc ctg acc cac gac aag ggg
Glu Pro Ile Ser Pro Val Ser Ser Pro Ser Leu Thr His Asp Lys Gly
2010 2015 2020 2025

ctc tcc aaa cct ctg gaa gag cta gag aag agc cac ttg gaa ggg gag
Leu Ser Lys Pro Leu Glu Glu Leu Glu Lys Ser His Leu Glu Gly Glu
2030 2035 2040

ctg cgg cac aag cag cca ggc ccc atg aag ctc agc gcg gag gct gcc
Leu Arg His Lys Gln Pro Gly Pro Met Lys Leu Ser Ala Glu Ala Ala
2045 2050 2055

cat ctc cca cat ctg cgg cca ctg ccc gag agc cag ccc tca tcc agc
His Leu Pro His Leu Arg Pro Leu Pro Glu Ser Gln Pro Ser Ser Ser
2060 2065 2070

cca ctc ctc cag act gcc cca ggc atc aaa ggt cac cag agg gtg gtc
Pro Leu Leu Gln Thr Ala Pro Gly Ile Lys Gly His Gln Arg Val Val
2075 2080 2085

acc ctg gct cag cac atc agc gag gtc att acg cag gac tac acg cgc
Thr Leu Ala Gln His Ile Ser Glu Val Ile Thr Gln Asp Tyr Thr Arg
2090 2095 2100 2105

cac cac ccg cag cag ctc agt ggc ccc ctt ccc gcc cct ctc tac tcc
His His Pro Gln Gln Leu Ser Gly Pro Leu Pro Ala Pro Leu Tyr Ser
2110 2115 2120

ttt ccc gga gcc agc tgc cct gtc ctg gat ctt cgc cgc cca ccc agt
Phe Pro Gly Ala Ser Cys Pro Val Leu Asp Leu Arg Arg Pro Pro Ser
2125 2130 2135

gac ctc tac ctc cca ccc ccc gac cat ggc acc cca gcc cgg gga tcc
Asp Leu Tyr Leu Pro Pro Pro Asp His Gly Thr Pro Ala Arg Gly Ser
2140 2145 2150

ccc cac agt gaa ggg ggc aaa agg tcc cca gaa ccc agc aaa aca tcg
Pro His Ser Glu Gly Gly Lys Arg Ser Pro Glu Pro Ser Lys Thr Ser
2155 2160 2165

gtc ctg ggc agc agc gag gat gcc att gag cct gtg tcc cca cca gag
Val Leu Gly Ser Ser Glu Asp Ala Ile Glu Pro Val Ser Pro Pro Glu
2170 2175 2180 2185

ggc atg act gag cca gga cat gct cgg agc act gcg tac cca ctg ctg
Gly Met Thr Glu Pro Gly His Ala Arg Ser Thr Ala Tyr Pro Leu Leu
2190 2195 2200

tat cga gac ggg gaa cag ggc gag ccc agg atg ggt cta gag tct cca
Tyr Arg Asp Gly Glu Gln Gly Glu Pro Arg Met Gly Leu Glu Ser Pro
2205 2210 2215
ggc aac acc agc cag ccg cca acc ttc ttc agt aag ctg act gag agc
Gly Asn Thr Ser Gln Pro Pro Thr Phe Phe Ser Lys Leu Thr Glu Ser
2220 2225 2230

aac tcc gcc atg gtg aag tcg aag aag cag gag atc aac aag aaa ctc
Asn Ser Ala Met Val Lys Ser Lys Lys Gln Glu Ile Asn Lys Lys Leu
2235 2240 2245

aac acc cac aac cgg aac gag cca gaa tac aat att ggc cag cct ggg
Asn Thr His Asn Arg Asn Glu Pro Glu Tyr Asn Ile Gly Gln Pro Gly
2250 2255 2260 2265

acg gaa atc ttc aac atg ccc gcc atc act gga gca ggc ctt atg acc
Thr Glu Ile Phe Asn Met Pro Ala Ile Thr Gly Ala Gly Leu Met Thr
2270 2275 2280

tgt aga agc cag gcg gtg caa gaa cac gcc agc acc aac atg ggg cta
Cys Arg Ser Gln Ala Val Gln Glu His Ala Ser Thr Asn Met Gly Leu
2285 2290 2295

gag gcc att att aga aag gca ctc atg ggt aaa tat gat cag tgg gaa
Glu Ala Ile Ile Arg Lys Ala Leu Met Gly Lys Tyr Asp Gln Trp Glu
2300 2305 2310

gag ccc ccg ccg ctc ggc gcc aat gct ttt aac cct ctg aat gcc agc
Glu Pro Pro Pro Leu Gly Ala Asn Ala Phe Asn Pro Leu Asn Ala Ser
2315 2320 2325

gcc agt ctg ccc gct gct gct atg ccc ata acc act gct gac gga cgg
Ala Ser Leu Pro Ala Ala Ala Met Pro Ile Thr Thr Ala Asp Gly Arg
2330 2335 2340 2345

agt gac cac gca ctc acc tcg cca ggt gga ggt ggg aaa gcc aag gtc
Ser Asp His Ala Leu Thr Ser Pro Gly Gly Gly Gly Lys Ala Lys Val
2350 2355 2360

tct ggc aga cct agc agc cga aaa gcc aag tcg cca gca cca ggc cta
Ser Gly Arg Pro Ser Ser Arg Lys Ala Lys Ser Pro Ala Pro Gly Leu
2365 2370 2375

gcg tcc gga gac cga ccc cct tct gtc tcc tca gta cac tca gag ggg
Ala Ser Gly Asp Arg Pro Pro Ser Val Ser Ser Val His Ser Glu Gly
2380 2385 2390

gac tgc aat cgc cga aca cca ctc acc aac cgt gtg tgg gag gac cgg
Asp Cys Asn Arg Arg Thr Pro Leu Thr Asn Arg Val Trp Glu Asp Arg
2395 2400 2405

ccc tca tct gca ggg tcc acg cca ttc ccc tac aac cct ttg att atg
Pro Ser Ser Ala Gly Ser Thr Pro Phe Pro Tyr Asn Pro Leu Ile Met
2410 2415 2420 2425

agg cta cag gca ggt gtc atg gcc tcc ccg ccc cca cct ggc ctt gcg
Arg Leu Gln Ala Gly Val Met Ala Ser Pro Pro Pro Pro Gly Leu Ala
2430 2435 2440

gca ggc agc ggg ccc cta gct ggt ccc cac cac gcc tgg gat gag gag
Ala Gly Ser Gly Pro Leu Ala Gly Pro His His Ala Trp Asp Glu Glu
2445 2450 2455
ccc aag cca ctg ctg tgt tca cag tat gag aca ctc tcg gac agc gag 8044
Pro Lys Pro Leu Leu Cys Ser Gln Tyr Glu Thr Leu Ser Asp Ser Glu
2460 2465 2470

tga ccacggattg ggggggagcg gtgccaggtc ccgcacaagg cagaagcagc 8097

ccagcatgga gcagacagct gctgactccc gagactgagg aaggagcccc tgagtctgcc 8157

tgcgcgtcca tccgtncgtc gtncactcat ctgtccatcc agagctggca ttctgcctgt 8217

ctaaagcctt aactaagact tccaccccgg gctggccctg cgcagtgacc ttacactcag 8277

gggattgttt accttggtgc tcganaaggg ggagtggaca ggaaggggag ggacaagccg 8337

ggccangagg gggggggaca ancaattcgt gtgtcaagtc gcactcntgc t 8388

<210> 7 

<211> 2473 

<212> PRT 

<213> Mus musculus



<400> 7 





Met 


Ser 


Gly 


Ser 


Thr 


Gin 


Pro 


Val 


Ala 


Gin 


Thr 


Trp 


Arg 


Ala 


Ala 


Glu 




1 








5 










10 










15 






Pro 


Arg 


Tyr 


Pro 


Pro 


His 


Gly 


He 


Ser 


Tyr 


Pro 


Val 


Gin 


He 


Ala 


Arg 










20 










25 
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Ser 


His 


Thr 


Asp 


Val 


Gly 


Leu 


Leu 


Glu 


Tyr 


Gin 


His 


His 


Pro 


Arg 


Asp 








35 










40 










45 








;L 


Tyr 


Thr 


Ser 


His 


Leu 


Ser 


Pro 


Gly 


Ser 


He 


He 


Gin 


Pro 


Gin 


Arg 


Arg 






50 










55 










60 










w 


Arg 


Pro 


Ser 


Leu 


Leu 


Ser 


Glu 


Phe 


Gin 


Pro 


Gly 


Ser 


Glu 


Arg 


Ser 


Gin 




65 










70 










75 










80 


in 


Glu 


Leu 


His 


Leu 


Arg 


Pro 


Glu 


Ser 


Arg 


Thr 


Phe 


Leu 


Pro 


Glu 


Leu 


Gly 


w 










85 










90 










95 






Lys 


Pro 


Asp 


He 


Glu 


Phe 


Thr 


Glu 


Ser 


Lys 


Arg 


Pro 


Arg 


Leu 


Glu 


Leu 


o 








100 










105 










110 








Leu 


Pro 


Asp 


Thr 


Leu 


Leu 


Arg 


Pro 


Ser 


Pro 


Leu 


Leu 


Ala 


Thr 


Gly 


Gin 








115 










12 0 










125 










Pro 


Ser 


Gly 


Ser 


Glu 


Asp 


Leu 


Thr 


Lys 


Asp 


Arg 


Ser 


Leu 


Ala 


Gly 


Lys 






130 










135 










140 












Leu 


Glu 


Pro 


val 


Ser 


Pro 


Pro 


Ser 


Pro 


Pro 


His 


Ala 


Asp 


Pro 


Glu 


Leu 




145 










150 










155 










160 




Glu 


Leu 


Ala 


Pro 


Ser 
165 


Arg 


Leu 


Ser 


Lys 


Glu 
170 


Glu 


Leu 


He 


Gin 


Asn 
175 


Met 




Asp 


Arg 


Val 


Asp 
180 


Arg 


Glu 


He 


Thr 


Met 
185 


Val 


Glu 


Gin 


Gin 


He 
190 


Ser 


Lys 




Leu 


Lys 


Lys 
195 


Lys 


Gin 


Gin 


Gin 


Leu 
200 


Glu 


Glu 


Glu 


Ala 


Ala 
205 


Lys 


Pro 


Pro 




Glu 


Pro 
210 


Glu 


Lys 


Pro 


Val 


Ser 
215 


Pro 


Pro 


Pro 


He 


Glu 
220 


Ser 


Lys 


His 


Arg 




Ser 


Leu 


Val 


Gin 


He 


He 


Tyr 


Asp 


Glu 


Asn 


Arg 


Lys 


Lys 


Ala 


Glu 


Ala 




225 










230 










235 










240 




Ala 


His 


Arg 


He 


Leu 
245 


Glu 


Gly 


Leu 


Gly 


Pro 
250 


Gin 


Val 


Glu 


Leu 


Pro 
255 


Leu 




Tyr 


Asn 


Gin 


Pro 
260 


Ser 


Asp 


Thr 


Arg 


Gin 
265 


Tyr 


His 


Glu 


Asn 


He 
270 


Lys 


He 




Asn Gln Ala Met Arg Lys Lys Leu Ile Leu Tyr Phe Lys Arg Arg Asn

275 280 285

His Ala Arg Lys Gln Trp Glu Gln Arg Phe Cys Gln Arg Tyr Asp Gln

290 295 300

Leu Met Glu Ala Trp Glu Lys Lys Val Glu Arg Ile Glu Asn Asn Pro

305 310 315 320

Arg Arg Arg Ala Lys Glu Ser Lys Val Arg Glu Tyr Tyr Glu Lys Gln

325 330 335

Phe Pro Glu Ile Arg Lys Gln Arg Glu Leu Gln Glu Arg Met Gln Ser

340 345 350

Arg Val Gly Gln Arg Gly Ser Gly Leu Ser Met Ser Ala Ala Arg Ser

355 360 365

Glu His Glu Val Ser Glu Ile Ile Asp Gly Leu Ser Glu Gln Glu Asn

370 375 380

Leu Glu Lys Gln Met Arg Gln Leu Ala Val Ile Pro Pro Met Leu Tyr

385 390 395 400

Asp Ala Asp Gln Gln Arg Ile Lys Phe Ile Asn Met Asn Gly Leu Met

405 410 415

Asp Asp Pro Met Lys Val Tyr Lys Asp Arg Gln Val Thr Asn Met Trp

420 425 430

Ser Glu Gln Glu Arg Asp Thr Phe Arg Glu Lys Phe Met Gln His Pro

435 440 445

Lys Asn Phe Gly Leu Ile Ala Ser Phe Leu Glu Arg Lys Thr Val Ala

450 455 460

Glu Cys Val Leu Tyr Tyr Tyr Leu Thr Lys Lys Asn Glu Asn Tyr Lys

465 470 475 480


Ser Leu Val Arg Arg Ser Tyr Arg Arg Arg Gly Lys Ser Gln Gln Gln

485 490 495

Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Met Ala Arg Ser Ser

500 505 510

Gln Glu Glu Lys Glu Glu Lys Glu Lys Glu Lys Glu Ala Asp Lys Glu

515 520 525

Glu Glu Lys Gln Asp Ala Glu Asn Glu Lys Glu Glu Leu Ser Lys Glu

530 535 540

Lys Thr Asp Asp Thr Ser Gly Glu Asp Asn His Glu Lys Glu Ala Val

545 550 555 560

Ala Ser Lys Gly Arg Lys Thr Ala Asn Ser Gln Gly Arg Arg Lys Gly

565 570 575

Arg Ile Thr Arg Ser Met Ala Asn Glu Ala Asn His Glu Glu Thr Ala

580 585 590

Thr Pro Gln Gln Ser Ser Glu Leu Ala Ser Met Glu Met Asn Glu Ser

595 600 605

Ser Arg Trp Thr Glu Glu Glu Met Glu Thr Ala Lys Lys Gly Leu Leu

610 615 620

Glu His Gly Arg Asn Trp Ser Ala Ile Ala Arg Met Val Gly Ser Lys

625 630 635 640

Thr Val Ser Gln Cys Lys Asn Phe Tyr Phe Asn Tyr Lys Lys Arg Gln

645 650 655

Asn Leu Asp Glu Ile Leu Gln Gln His Lys Leu Lys Met Glu Lys Glu

660 665 670

Arg Asn Ala Arg Arg Lys Lys Lys Lys Thr Pro Ala Ala Ala Ser Glu

675 680 685

Glu Thr Ala Phe Pro Pro Ala Ala Glu Asp Glu Glu Met Glu Ala Ser

690 695 700

Gly Ala Ser Ala Asn Glu Glu Glu Leu Ala Glu Glu Ala Glu Ala Ser

705 710 715 720

Gln Ala Ser Gly Asn Glu Val Pro Arg Val Gly Glu Cys Ser Gly Pro

725 730 735

Ala Ala Val Asn Asn Ser Ser Asp Thr Glu Ser Val Pro Ser Pro Arg

740 745 750

Ser Glu Ala Met Lys Asp Thr Gly Pro Lys Pro Thr Gly Thr Glu Ala

755 760 765

Leu Pro Ala Ala Thr Gln Pro Pro Val Pro Pro Pro Glu Glu Pro Ala

770 775 780

Val Ala Pro Ala Glu Pro Ser Pro Val Pro Asp Ala Ser Gly Pro Pro

785 790 795 800


Ser 


Pro 


Glu 


Pro 


Ser 
805 


His 


His 


Leu 


Pro 


His 
810 


Pro 


Arg 






Trp 
815 




Arg 


Met 


Asn 


Lys 
820 


Lys 


Pro 


Arg 


Leu 


Leu 
825 


Gln


Leu 


Pro 


Arg 


Gln
830 


Arg 




Pro 


Arg 


Ser 
835 


Arg 


Ser 


Leu 


Arg 


Pro 
840 


Arg 


Arg 


Ser 


Met 


Trp 

845


Glu 


Lys 


Pro 


Glu 


Glu 
850 


Pro 


Glu 


Ala 


Ser 


Glu 
855 


Glu 


Pro 


Pro 


Glu 


860 




Lys 




Asp 


His 


Lys 


Glu 


Glu 


Thr 


Glu 


Glu 


Glu 


Pro 


Glu 


Asp 


Lys 


Ala 


Lys 


Gly 




865 










870 










875 










880 


Glu 


Ala 


Ile


Glu 


Thr 
885 


Val 


Ser 


Glu 


Ala 


Pro 
890 


Leu 


Lys 






895 




Gly 


Ser 


Lys 


Ala 
900 


Ala 


Val 


Thr 


Lys 


Gly 
905 


Ser 


Ser 


Ser 


Gly 


910 






Asp 


Ser 


Asp 
915 


Phe 


Ser 


Ala 


Thr 


Cys 
920 


Ser 


Ala 


Asp 


Glu 


Val 
925 


Asp 


Glu 


Pro 


Glu 


Gly 
930 


Gly 


Asp 


Lys 


Gly 


Arg 
935 


Leu 


Leu 


Ser 


Pro 


Arg 
940 


Pro 


Ser 






Thr Pro Ala Gly Asp Pro Arg Ala Ser Thr Ser Pro Gln Lys Pro Leu

945 950 955 960

Asp Leu Lys Gln Leu Lys Gln Arg Ala Ala Ala Ile Pro Pro Ile Gln

965 970 975

Val Thr Lys Val His Glu Pro Pro Arg Glu Asp Thr Val Pro Pro Lys

980 985 990

Pro Val Pro Pro Val Pro Pro Pro Thr Gln His Leu Gln Pro Glu Gly

995 1000 1005

Asp Val Ser Gln Gln Ser Gly Gly Ser Pro Arg Gly Lys Ser Arg Ser

1010 1015 1020

Pro Val Pro Pro Ala Glu Lys Glu Ala Glu Lys Pro Ala Phe Phe Pro

1025 1030 1035 1040

Ala Phe Pro Thr Glu Gly Pro Lys Leu Pro Thr Glu Pro Pro Arg Trp

1045 1050 1055

Ser Ser Gly Leu Pro Phe Pro Ile Pro Pro Arg Glu Val Ile Lys Thr

1060 1065 1070

Ser Pro His Ala Ala Asp Pro Ser Ala Phe Ser Tyr Thr Pro Pro Gly

1075 1080 1085

His Pro Leu Pro Leu Gly Leu His Asp Ser Ala Arg Pro Val Leu Pro

1090 1095 1100

Arg Pro Pro Ile Ser Asn Pro Pro Pro Leu Ile Ser Ser Ala Lys His

1105 1110 1115 1120

Pro Gly Val Leu Glu Arg Gln Leu Gly Ala Ile Ser Gln Gln Gly Met

1125 1130 1135

Ser Val Gln Leu Arg Val Pro His Ser Glu His Ala Lys Ala Pro Met

1140 1145 1150

Gly Pro Leu Thr Met Gly Leu Pro Leu Ala Val Asp Pro Lys Lys Leu

1155 1160 1165

Gly Thr Ala Leu Gly Ser Ala Thr Ser Gly Ser Ile Thr Lys Gly Leu

1170 1175 1180

Pro Ser Thr Arg Ala Ala Asp Gly Pro Ser Tyr Arg Gly Ser Ile Thr

1185 1190 1195 1200

His Gly Thr Pro Ala Asp Val Leu Tyr Lys Gly Thr Ile Ser Arg Ile

1205 1210 1215

Val Gly Glu Asp Ser Pro Ser Arg Leu Asp Arg Ala Arg Glu Asp Thr

1220 1225 1230

Leu Pro Lys Gly His Val Ile Tyr Glu Gly Lys Lys Gly His Val Leu

1235 1240 1245

Ser Tyr Glu Gly Gly Met Ser Val Ser Gln Cys Ser Lys Glu Asp Gly

1250 1255 1260

Arg Ser Ser Ser Gly Pro Pro His Glu Thr Ala Ala Pro Lys Arg Thr

1265 1270 1275 1280

Tyr Asp Met Met Glu Gly Arg Val Gly Arg Thr Val Thr Ser Ala Ser 

1285 1290 1295 

Ile Glu Gly Leu Met Gly Arg Ala Ile Pro Glu Gln His Ser Pro His

1300 1305 1310

Leu Lys Glu Gln His His Ile Arg Gly Ser Ile Thr Gln Gly Ile Pro

1315 1320 1325

Arg Ser Tyr Val Glu Ala Gln Glu Asp Tyr Leu Arg Arg Glu Ala Lys

1330 1335 1340 

Leu Leu Lys Arg Glu Gly Thr Pro Pro Pro Pro Pro Pro Pro Arg Asp 
1345 1350 1355 1360 

Leu Thr Glu Thr Tyr Lys Pro Arg Pro Leu Asp Pro Leu Gly Pro Leu 

1365 1370 1375 

Lys Leu Lys Pro Thr His Glu Gly Val Val Ala Thr Val Lys Glu Ala 

1380 1385 1390 

Gly Arg Ser Ile His Glu Ile Pro Arg Glu Glu Leu Arg Arg Thr Pro

1395 1400 1405

Glu Leu Pro Leu Ala Pro Arg Pro Leu Lys Glu Gly Ser Ile Thr Gln

1410 1415 1420 

Gly Thr Pro Leu Lys Tyr Asp Ser Gly Ala Pro Ser Thr Gly Thr Lys 
1425 1430 1435 1440 

Lys His Asp Val Arg Ser Ile Ile Gly Ser Pro Gly Arg Pro Phe Pro

1445 1450 1455

Ala Leu His Pro Leu Asp Ile Met Ala Asp Ala Arg Ala Leu Glu Arg

1460 1465 1470

Ala Cys Tyr Glu Glu Ser Leu Lys Ser Arg Ser Gly Thr Ser Ser Gly

1475 1480 1485

Ala Gly Gly Ser Ile Thr Arg Gly Ala Pro Val Val Val Pro Glu Leu

1490 1495 1500

Gly Lys Pro Arg Gln Ser Pro Leu Thr Tyr Glu Asp His Gly Ala Pro
1505 1510 1515 1520 

Phe Thr Ser His Leu Pro Arg Gly Ser Pro Val Thr Thr Arg Glu Pro 

1525 1530 1535 

Thr Pro Arg Leu Gln Glu Gly Ser Leu Leu Ser Ser Lys Ala Ser Gln

1540 1545 1550

Asp Arg Lys Leu Thr Ser Thr Pro Arg Glu Ile Ala Lys Ser Pro His

1555 1560 1565

Ser Thr Val Pro Glu His His Pro His Pro Ile Ser Pro Tyr Glu His

1570 1575 1580

Leu Leu Arg Gly Val Thr Gly Val Asp Leu Tyr Arg Gly His Ile Pro
1585 1590 1595 1600

Leu Ala Phe Asp Pro Thr Ser Ile Pro Arg Gly Ile Pro Leu Glu Ala

1605 1610 1615 

Ala Ala Ala Ala Tyr Tyr Leu Pro Arg His Leu Ala Pro Ser Pro Thr 

1620 1625 1630 

Tyr Pro His Leu Tyr Pro Pro Tyr Leu Ile Arg Gly Tyr Pro Asp Thr

1635 1640 1645

Ala Ala Leu Glu Asn Arg Gln Thr Ile Ile Asn Asp Tyr Ile Thr Ser

1650 1655 1660

Gln Gln Met His His Asn Ala Ala Ser Ala Met Ala Gln Arg Ala Asp
1665 1670 1675 1680 

Met Leu Arg Gly Leu Ser Pro Arg Glu Ser Ser Leu Ala Leu Asn Tyr 

1685 1690 1695 

Ala Ala Gly Pro Arg Gly Ile Ile Asp Leu Ser Gln Val Pro His Leu

1700 1705 1710

Pro Val Leu Val Pro Pro Thr Pro Gly Thr Pro Ala Thr Ala Ile Asp

1715 1720 1725 

Arg Leu Ala Tyr Leu Pro Thr Ala Pro Pro Pro Phe Ser Ser Arg His 

1730 1735 1740 

Ser Ser Ser Pro Leu Ser Pro Gly Gly Pro Thr His Leu Ala Lys Pro
1745 1750 1755 1760

Thr Ala Thr Ser Ser Ser Glu Arg Glu Arg Glu Arg Glu Arg Glu Arg 

1765 1770 1775 

Asp Lys Ser Ile Leu Thr Ser Thr Thr Thr Val Glu His Ala Pro Ile

1780 1785 1790

Trp Arg Pro Gly Thr Glu Gln Ser Ser Gly Ala Gly Gly Ser Ser Arg

1795 1800 1805

Pro Ala Ser His Thr His Gln His Ser Pro Ile Ser Pro Arg Thr Gln

1810 1815 1820

Asp Ala Leu Gln Gln Arg Pro Ser Val Leu His Asn Thr Ser Met Lys
1825 1830 1835 1840 

Gly Val Val Thr Ser Val Glu Pro Gly Thr Pro Thr Val Leu Arg Trp 

1845 1850 1855 

Ala Arg Ser Thr Ser Thr Ser Ser Pro Val Arg Pro Ala Ala Thr Phe 

1860 1865 1870 

Pro Pro Ala Thr His Cys Pro Leu Gly Gly Thr Leu Glu Gly Val Tyr 

1875 1880 1885 

Pro Thr Leu Met Glu Pro Val Leu Leu Pro Lys Glu Thr Ser Arg Val 

1890 1895 1900 

Ala Arg Pro Glu Arg Ala Arg Val Asp Ala Gly His Ala Phe Leu Thr 
1905 1910 1915 1920 

Lys Pro Pro Gly Arg Glu Pro Ala Ser Ser Pro Ser Lys Ser Ser Glu 

1925 1930 1935 

Pro Arg Ser Leu Ala Pro Pro Ser Ser Ser His Thr Ala Ile Ala Arg

1940 1945 1950 

Thr Pro Ala Lys Asn Leu Ala Pro His His Ala Ser Pro Asp Pro Pro 

1955 1960 1965 

Ala Pro Thr Ser Ala Ser Asp Leu His Arg Glu Lys Thr Gln Ser Lys

1970 1975 1980

Pro Phe Ser Ile Gln Glu Leu Glu Leu Arg Ser Leu Gly Tyr His Ser
1985 1990 1995 2000

Gly Ala Gly Tyr Ser Pro Asp Gly Val Glu Pro Ile Ser Pro Val Ser

2005 2010 2015 

Ser Pro Ser Leu Thr His Asp Lys Gly Leu Ser Lys Pro Leu Glu Glu 

2020 2025 2030 

Leu Glu Lys Ser His Leu Glu Gly Glu Leu Arg His Lys Gln Pro Gly

2035 2040 2045 

Pro Met Lys Leu Ser Ala Glu Ala Ala His Leu Pro His Leu Arg Pro 

2050 2055 2060 

Leu Pro Glu Ser Gln Pro Ser Ser Ser Pro Leu Leu Gln Thr Ala Pro
2065 2070 2075 2080

Gly Ile Lys Gly His Gln Arg Val Val Thr Leu Ala Gln His Ile Ser

2085 2090 2095

Glu Val Ile Thr Gln Asp Tyr Thr Arg His His Pro Gln Gln Leu Ser

2100 2105 2110 

Gly Pro Leu Pro Ala Pro Leu Tyr Ser Phe Pro Gly Ala Ser Cys Pro 

2115 2120 2125 

Val Leu Asp Leu Arg Arg Pro Pro Ser Asp Leu Tyr Leu Pro Pro Pro 

2130 2135 2140 

Asp His Gly Thr Pro Ala Arg Gly Ser Pro His Ser Glu Gly Gly Lys 
2145 2150 2155 2160 

Arg Ser Pro Glu Pro Ser Lys Thr Ser Val Leu Gly Ser Ser Glu Asp 

2165 2170 2175 

Ala Ile Glu Pro Val Ser Pro Pro Glu Gly Met Thr Glu Pro Gly His

2180 2185 2190

Ala Arg Ser Thr Ala Tyr Pro Leu Leu Tyr Arg Asp Gly Glu Gln Gly

2195 2200 2205

Glu Pro Arg Met Gly Leu Glu Ser Pro Gly Asn Thr Ser Gln Pro Pro

2210 2215 2220 

Thr Phe Phe Ser Lys Leu Thr Glu Ser Asn Ser Ala Met Val Lys Ser
2225 2230 2235 2240

Lys Lys Gln Glu Ile Asn Lys Lys Leu Asn Thr His Asn Arg Asn Glu

2245 2250 2255

Pro Glu Tyr Asn Ile Gly Gln Pro Gly Thr Glu Ile Phe Asn Met Pro

2260 2265 2270

Ala Ile Thr Gly Ala Gly Leu Met Thr Cys Arg Ser Gln Ala Val Gln

2275 2280 2285

Glu His Ala Ser Thr Asn Met Gly Leu Glu Ala Ile Ile Arg Lys Ala

2290 2295 2300

Leu Met Gly Lys Tyr Asp Gln Trp Glu Glu Pro Pro Pro Leu Gly Ala
2305 2310 2315 2320 

Asn Ala Phe Asn Pro Leu Asn Ala Ser Ala Ser Leu Pro Ala Ala Ala 

2325 2330 2335 

Met Pro Ile Thr Thr Ala Asp Gly Arg Ser Asp His Ala Leu Thr Ser

2340 2345 2350 

Pro Gly Gly Gly Gly Lys Ala Lys Val Ser Gly Arg Pro Ser Ser Arg 

2355 2360 2365 

Lys Ala Lys Ser Pro Ala Pro Gly Leu Ala Ser Gly Asp Arg Pro Pro 

2370 2375 2380 

Ser Val Ser Ser Val His Ser Glu Gly Asp Cys Asn Arg Arg Thr Pro 
2385 2390 2395 2400 

Leu Thr Asn Arg Val Trp Glu Asp Arg Pro Ser Ser Ala Gly Ser Thr 

2405 2410 2415 

Pro Phe Pro Tyr Asn Pro Leu Ile Met Arg Leu Gln Ala Gly Val Met

2420 2425 2430 

Ala Ser Pro Pro Pro Pro Gly Leu Ala Ala Gly Ser Gly Pro Leu Ala 

2435 2440 2445 

Gly Pro His His Ala Trp Asp Glu Glu Pro Lys Pro Leu Leu Cys Ser 

2450 2455 2460 

Gln Tyr Glu Thr Leu Ser Asp Ser Glu
2465 2470 

<210> 8 

<211> 7455 

<212> DNA 

<213> Mus musculus 

<220> 
<221> CDS 

<222> (363)...(7124)

<221> misc_feature
<222> (1)...(7465)
<223> n = A,T,C or G 

<400> 8 

ggcacgaggg cagcgcaggc cgggccgcat ccccgtcccc gcgccagccg cccgcgcccg 60 

ccatgcgcgc cccgcagcgg cccgcgcgtc cgggccccgc gtcgtagcgc ggcgggcgga 120

gaccgcaggc tctcagcccg gacccgccgc atcctcgagc ccgatcggcg ccgtagcccg 180

gcgccagcgc ccggtgccgc cgccggcgag tgctcctgag tctttgagga acacagcctc 240

ctggtggaag ttcgtggcac ctgtgacgag gtcacctgcc agcagatgac cgagaccagc 300

ccttagtcct aggtgtggtc aagagtgtct tgggctccaa agcctacctg gaccctacca 360 

cc atg tca gga tcc aca cag cct gtg gca cag aca tgg cgg gct gct 407
Met Ser Gly Ser Thr Gln Pro Val Ala Gln Thr Trp Arg Ala Ala
1 5 10 15

gag ccc cgc tac cca ccc cat ggc atc tcc tac ccg gtg cag ata gcc 455
Glu Pro Arg Tyr Pro Pro His Gly Ile Ser Tyr Pro Val Gln Ile Ala
20 25 30

cgg tcc cac acg cct ctg tac aac cag ccg tct gac aca cgc cag tac
Arg Ser His Thr Pro Leu Tyr Asn Gln Pro Ser Asp Thr Arg Gln Tyr
35 40 45

cat gaa aac atc aaa ata aac cag gcg atg cgg aag aag ctg atc ttg
His Glu Asn Ile Lys Ile Asn Gln Ala Met Arg Lys Lys Leu Ile Leu
50 55 60

tac ttt aag cgg agg aac cac gcg cgc aag cag tgg gaa cag cgc ttc
Tyr Phe Lys Arg Arg Asn His Ala Arg Lys Gln Trp Glu Gln Arg Phe
65 70 75

tgc cag cgc tat gac cag ctc atg gag gcg tgg gag aag aag gta gag
Cys Gln Arg Tyr Asp Gln Leu Met Glu Ala Trp Glu Lys Lys Val Glu
80 85 90 95



cgc ata gag aac aat ccg cga agg agg gcc aag gag agc aag gtg agg
Arg Ile Glu Asn Asn Pro Arg Arg Arg Ala Lys Glu Ser Lys Val Arg
100 105 110

gag tac tac gag aaa cag ttc ccg gag atc cgc aag cag cgg gag ctg
Glu Tyr Tyr Glu Lys Gln Phe Pro Glu Ile Arg Lys Gln Arg Glu Leu
115 120 125

cag gag cgc atg cag agc agg gtg ggc cag cgt ggc agt ggg ctc tcc
Gln Glu Arg Met Gln Ser Arg Val Gly Gln Arg Gly Ser Gly Leu Ser
130 135 140

atg tcg gct gcc cgc agt gag cat gag gtt tct gag atc att gat ggc
Met Ser Ala Ala Arg Ser Glu His Glu Val Ser Glu Ile Ile Asp Gly
145 150 155

ttg tct gag cag gag aac ctg gag aag cag atg cgc cag ctg gcc gtg
Leu Ser Glu Gln Glu Asn Leu Glu Lys Gln Met Arg Gln Leu Ala Val
160 165 170 175

atc ccg ccc atg ttg tac gac gcg gac cag cag agg atc aag ttc atc
Ile Pro Pro Met Leu Tyr Asp Ala Asp Gln Gln Arg Ile Lys Phe Ile
180 185 190

aac atg aat gga ctc atg gat gac ccc atg aag gtc tac aag gac cgt
Asn Met Asn Gly Leu Met Asp Asp Pro Met Lys Val Tyr Lys Asp Arg
195 200 205

cag gtt acc aac atg tgg agc gag cag gag agg gac acc ttc cgt gag
Gln Val Thr Asn Met Trp Ser Glu Gln Glu Arg Asp Thr Phe Arg Glu
210 215 220

aag ttt atg cag cac cct aag aac ttt ggc ctg att gcc tca ttc ctg
Lys Phe Met Gln His Pro Lys Asn Phe Gly Leu Ile Ala Ser Phe Leu
225 230 235

gag aga aag acg gtc gct gag tgt gtc ctc tat tac tac ctg acc aag
Glu Arg Lys Thr Val Ala Glu Cys Val Leu Tyr Tyr Tyr Leu Thr Lys
240 245 250 255

aag aat gaa aat tac aag agc ttg gtg agg cgg agc tat cgg cgc cgt
Lys Asn Glu Asn Tyr Lys Ser Leu Val Arg Arg Ser Tyr Arg Arg Arg
260 265 270

ggc aag agc cag cag cag cag cag cag caa caa cag cag cag cag cag
Gly Lys Ser Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln
275 280 285

cag atg gca cgg agc agc cag gag gag aag gag gag aag gag aag gag
Gln Met Ala Arg Ser Ser Gln Glu Glu Lys Glu Glu Lys Glu Lys Glu
290 295 300

aag gag gcc gac aag gag gaa gag aag cag gat gcg gag aac gag aag
Lys Glu Ala Asp Lys Glu Glu Glu Lys Gln Asp Ala Glu Asn Glu Lys
305 310 315

gaa gaa ctc agc aag gag aag aca gac gac act tct ggc gag gac aac
Glu Glu Leu Ser Lys Glu Lys Thr Asp Asp Thr Ser Gly Glu Asp Asn
320 325 330 335

gat gag aaa gag gcc gtg gcc tcc aaa ggc cgc aaa act gcc aac agc
Asp Glu Lys Glu Ala Val Ala Ser Lys Gly Arg Lys Thr Ala Asn Ser
340 345 350

caa ggc cgc cgc aaa ggc cgt atc acg cgc tcc atg gcc aac gag gcc
Gln Gly Arg Arg Lys Gly Arg Ile Thr Arg Ser Met Ala Asn Glu Ala
355 360 365

aac cat gag gag aca gcc acc cca cag caa agt tca gag ctg gct tcc
Asn His Glu Glu Thr Ala Thr Pro Gln Gln Ser Ser Glu Leu Ala Ser
370 375 380

atg gag atg aac gag agt tct cgc tgg act gag gaa gag atg gag aca
Met Glu Met Asn Glu Ser Ser Arg Trp Thr Glu Glu Glu Met Glu Thr
385 390 395

gca aag aaa ggc ctc ctg gaa cat ggg agg aac tgg tca gcc att gcc
Ala Lys Lys Gly Leu Leu Glu His Gly Arg Asn Trp Ser Ala Ile Ala
400 405 410 415

cgc atg gtg ggc tcc aag acc gtg tcc cag tgt aag aac ttc tac ttc
Arg Met Val Gly Ser Lys Thr Val Ser Gln Cys Lys Asn Phe Tyr Phe
420 425 430

aac tac aag aag agg cag aac ctg gac gaa atc ctt cag cag cac aag
Asn Tyr Lys Lys Arg Gln Asn Leu Asp Glu Ile Leu Gln Gln His Lys
435 440 445

cta aag atg gag aag gag agg aac gct cgg agg aag aag aag aag acc
Leu Lys Met Glu Lys Glu Arg Asn Ala Arg Arg Lys Lys Lys Lys Thr
450 455 460

cca gct gcg gcg agc gag gag aca gcc ttc cca cct gcc gct gag gac
Pro Ala Ala Ala Ser Glu Glu Thr Ala Phe Pro Pro Ala Ala Glu Asp
465 470 475

gaa gag atg gaa gca tca ggc gca agt gcc aat gag gaa gag ctg gcg
Glu Glu Met Glu Ala Ser Gly Ala Ser Ala Asn Glu Glu Glu Leu Ala
480 485 490 495

gag gag gca gaa gcc tca cag gcc tct ggg aat gag gtt ccc aga gtt
Glu Glu Ala Glu Ala Ser Gln Ala Ser Gly Asn Glu Val Pro Arg Val
500 505 510

ggg gag tgc agt ggc cca gct gct gtc aac aac agc tct gat act gag
Gly Glu Cys Ser Gly Pro Ala Ala Val Asn Asn Ser Ser Asp Thr Glu
515 520 525

agt gtc cca tcc ccg cgt tca gaa gcc acg aag gac act ggg cct aaa
Ser Val Pro Ser Pro Arg Ser Glu Ala Thr Lys Asp Thr Gly Pro Lys
530 535 540

ccc act ggc act gaa gca ttg ccc gct gcc acc cag cca cct gtt cct
Pro Thr Gly Thr Glu Ala Leu Pro Ala Ala Thr Gln Pro Pro Val Pro
545 550 555

cct cca gaa gaa ccg gca gta gcc cct gct gag ccc tcc cca gtc cct
Pro Pro Glu Glu Pro Ala Val Ala Pro Ala Glu Pro Ser Pro Val Pro
560 565 570 575

gat gcc agt ggc cca cca tcc cca gag cct tcc cat cac ctg ccg cac
Asp Ala Ser Gly Pro Pro Ser Pro Glu Pro Ser His His Leu Pro His
580 585 590

ccc cgg cta ctg tgg aca agg atg aac aag aag ccc cgg ctg ctc cag
Pro Arg Leu Leu Trp Thr Arg Met Asn Lys Lys Pro Arg Leu Leu Gln
595 600 605

ctc ccc aga cag agg atg cca agg agc aga agt ctg agg ccg agg aga
Leu Pro Arg Gln Arg Met Pro Arg Ser Arg Ser Leu Arg Pro Arg Arg
610 615 620

tcg atg tgg gaa aag cca gag gag ccc gag gct tct gag aag ccc ccg
Ser Met Trp Glu Lys Pro Glu Glu Pro Glu Ala Ser Glu Lys Pro Pro
625 630 635

aag agt gta aag agt gac cac aag aag gag acc gag gaa gag cct gaa
Lys Ser Val Lys Ser Asp His Lys Lys Glu Thr Glu Glu Glu Pro Glu
640 645 650 655

gac aaa gcc aag ggc aca gag gcc att gaa act gtg tct gag gca cca
Asp Lys Ala Lys Gly Thr Glu Ala Ile Glu Thr Val Ser Glu Ala Pro
660 665 670

ctt aag gtg gag aag gct ggt agc aag gca gct gtg acc aag ggt tcc
Leu Lys Val Glu Lys Ala Gly Ser Lys Ala Ala Val Thr Lys Gly Ser
675 680 685

agc tca ggt gcc acc cag gac agt gac tcc agt gcc acc tgc agt gcc
Ser Ser Gly Ala Thr Gln Asp Ser Asp Ser Ser Ala Thr Cys Ser Ala
690 695 700

gat gag gtg gac gaa ccc gaa gga ggt gac aag ggc agg ctg ctg tca
Asp Glu Val Asp Glu Pro Glu Gly Gly Asp Lys Gly Arg Leu Leu Ser
705 710 715

cca agg ccc agc ctc ctc acc ccg gct gga gat ccc cgg gcc agt acc
Pro Arg Pro Ser Leu Leu Thr Pro Ala Gly Asp Pro Arg Ala Ser Thr
720 725 730 735

tcg ccc cag aag ccg ctg gac ctg aag cag ctg aag cag cga gca gcc
Ser Pro Gln Lys Pro Leu Asp Leu Lys Gln Leu Lys Gln Arg Ala Ala
740 745 750

gcc atc ccc cct atc gtc acc aag gtc cat gag ccc ccc cgg gag gac
Ala Ile Pro Pro Ile Val Thr Lys Val His Glu Pro Pro Arg Glu Asp
755 760 765

aca gta ccc cca aag cca gtt ccc cct gtg cct cca ccc acg cag cac
Thr Val Pro Pro Lys Pro Val Pro Pro Val Pro Pro Pro Thr Gln His
770 775 780

cta cag cca gag ggt gac gtg tct cag cag tcg gga gga agt cca cgt
Leu Gln Pro Glu Gly Asp Val Ser Gln Gln Ser Gly Gly Ser Pro Arg
785 790 795

ggc aag tcc cgc agc cca gtg cct cct gcc gag aaa gag gca gag aaa
Gly Lys Ser Arg Ser Pro Val Pro Pro Ala Glu Lys Glu Ala Glu Lys
800 805 810 815

ccc gca ttc ttt ccg gct ttc cca act gag ggc cca aag cta ccg act
Pro Ala Phe Phe Pro Ala Phe Pro Thr Glu Gly Pro Lys Leu Pro Thr
820 825 830

gag ccc cca cgc tgg tca tcg ggc ctg ccc ttc ccc atc cct cca cgg
Glu Pro Pro Arg Trp Ser Ser Gly Leu Pro Phe Pro Ile Pro Pro Arg
835 840 845

gag gtg atc aag act tcc cca cac gcc gct gac ccc tct gcc ttc tcc
Glu Val Ile Lys Thr Ser Pro His Ala Ala Asp Pro Ser Ala Phe Ser
850 855 860

tac aca ccc ccc ggt cac ccg ctg cct ctg ggc ctc cac gat agt gcc
Tyr Thr Pro Pro Gly His Pro Leu Pro Leu Gly Leu His Asp Ser Ala
865 870 875

cgg ccc gtc ctg cca cgt ccc ccc atc tct aac ccc cca ccc ctc atc
Arg Pro Val Leu Pro Arg Pro Pro Ile Ser Asn Pro Pro Pro Leu Ile
880 885 890 895

tcc tct gcc aag cat ccc ggc gta ctt gag agg cag ctg ggt gcc atc
Ser Ser Ala Lys His Pro Gly Val Leu Glu Arg Gln Leu Gly Ala Ile
900 905 910

tcc cag cag ggg atg tca gtc cag ctt cgt gtg cct cac tca gag cat
Ser Gln Gln Gly Met Ser Val Gln Leu Arg Val Pro His Ser Glu His
915 920 925

gcc aag gcc ccc atg ggc cct ctc acc atg ggg ctg ccc ctt gcc gtg
Ala Lys Ala Pro Met Gly Pro Leu Thr Met Gly Leu Pro Leu Ala Val
930 935 940

gac cct aag aag ctg ggg aca gca ctg ggc tcc gcc acc agt gga agc
Asp Pro Lys Lys Leu Gly Thr Ala Leu Gly Ser Ala Thr Ser Gly Ser
945 950 955

atc acc aag ggc ctc ccc agt acc cgg gct gca gac ggc ccc agc tac
Ile Thr Lys Gly Leu Pro Ser Thr Arg Ala Ala Asp Gly Pro Ser Tyr
960 965 970 975

aga ggc tct atc acc cac ggc acg ccc gca gac gtc ctc tac aag ggt
Arg Gly Ser Ile Thr His Gly Thr Pro Ala Asp Val Leu Tyr Lys Gly
980 985 990

acc atc agc agg atc gtc ggt gag gac agc cca agt cgc ctt gac cgg
Thr Ile Ser Arg Ile Val Gly Glu Asp Ser Pro Ser Arg Leu Asp Arg
995 1000 1005

gca cga gag gac acc ctg ccc aag ggc cat gtc atc tat gag ggc aag
Ala Arg Glu Asp Thr Leu Pro Lys Gly His Val Ile Tyr Glu Gly Lys
1010 1015 1020

aaa ggc cac gtc cta tcc tat gaa ggt ggt atg tcc gtg tca cag tgc
Lys Gly His Val Leu Ser Tyr Glu Gly Gly Met Ser Val Ser Gln Cys
1025 1030 1035

tct aag gag gat gga agg agc agc tcg ggc cca ccc cat gag act gcc
Ser Lys Glu Asp Gly Arg Ser Ser Ser Gly Pro Pro His Glu Thr Ala
1040 1045 1050 1055

gcc cct aaa cgc acc tat gac atg atg gag ggc cgt gta ggc agg act
Ala Pro Lys Arg Thr Tyr Asp Met Met Glu Gly Arg Val Gly Arg Thr
1060 1065 1070

gtc acc tca gcc agc ata gag gga ctc atg ggc cgc gcc atc cct gag
Val Thr Ser Ala Ser Ile Glu Gly Leu Met Gly Arg Ala Ile Pro Glu
1075 1080 1085

cag cac agc ccc cac ctc aag gag cag cat cac atc cga ggc tcc atc
Gln His Ser Pro His Leu Lys Glu Gln His His Ile Arg Gly Ser Ile
1090 1095 1100

acg caa ggc atc ccg agg tcc tat gtg gag gcg cag gag gac tac tta
Thr Gln Gly Ile Pro Arg Ser Tyr Val Glu Ala Gln Glu Asp Tyr Leu
1105 1110 1115

cgg cgg gag gcc aag ctc ttg aag cga gaa ggg aca cca cca ccc cca
Arg Arg Glu Ala Lys Leu Leu Lys Arg Glu Gly Thr Pro Pro Pro Pro
1120 1125 1130 1135

cca cca cct cgg gac ctg act gag acc tac aag ccc cgg ccc ctg gac
Pro Pro Pro Arg Asp Leu Thr Glu Thr Tyr Lys Pro Arg Pro Leu Asp
1140 1145 1150

cct ctg ggt ccc ctg aag ctg aag ccg act cac gag ggt gtg gta gca
Pro Leu Gly Pro Leu Lys Leu Lys Pro Thr His Glu Gly Val Val Ala
1155 1160 1165

act gtg aag gag gcg ggc cgc tct atc cat gag atc ccg aga gag gag
Thr Val Lys Glu Ala Gly Arg Ser Ile His Glu Ile Pro Arg Glu Glu
1170 1175 1180

ctg cgc cgc aca cct gag cta ccc ctg gca cca cgg cct ctg aag gag
Leu Arg Arg Thr Pro Glu Leu Pro Leu Ala Pro Arg Pro Leu Lys Glu
1185 1190 1195

ggt tcc atc acc cag ggc acc cca ctc aag tac gac tct ggg gca ccc
Gly Ser Ile Thr Gln Gly Thr Pro Leu Lys Tyr Asp Ser Gly Ala Pro
1200 1205 1210 1215

tcc act ggc acc aag aaa cac gac gtg cgc tcc atc atc ggc agc ccc
Ser Thr Gly Thr Lys Lys His Asp Val Arg Ser Ile Ile Gly Ser Pro
1220 1225 1230

ggc cgg cct ttc cct gcc ctg cac ccg ctg gac ata atg gct gac gcc
Gly Arg Pro Phe Pro Ala Leu His Pro Leu Asp Ile Met Ala Asp Ala
1235 1240 1245

cgg gca ctg gag cgt gcc tgc tat gaa gag agt ctg aag agc cgg tca
Arg Ala Leu Glu Arg Ala Cys Tyr Glu Glu Ser Leu Lys Ser Arg Ser
1250 1255 1260

ggg acc agc agt ggt gca ggg ggc tcc atc aca cgt ggg gct cca gtc
Gly Thr Ser Ser Gly Ala Gly Gly Ser Ile Thr Arg Gly Ala Pro Val
1265 1270 1275

gtc gtg cct gaa ctg ggc aag cca cgg caa agc cca ctg act tac gaa
Val Val Pro Glu Leu Gly Lys Pro Arg Gln Ser Pro Leu Thr Tyr Glu
1280 1285 1290 1295

gac cac ggg gca ccc ttc acc agt cac ctg cca cgt ggc tcc cct gtg
Asp His Gly Ala Pro Phe Thr Ser His Leu Pro Arg Gly Ser Pro Val
1300 1305 1310

acc acg agg gag ccc acg cca cgc ctt cag gaa ggc agc ctc cta tcc
Thr Thr Arg Glu Pro Thr Pro Arg Leu Gln Glu Gly Ser Leu Leu Ser
1315 1320 1325

agc aag gcg tcc cag gac cgg aag ctg aca tct aca ccc cgg gag atc
Ser Lys Ala Ser Gln Asp Arg Lys Leu Thr Ser Thr Pro Arg Glu Ile
1330 1335 1340

gcc aag tcc cca cac agc act gtg ccc gag cac cac cct cac ccc atc
Ala Lys Ser Pro His Ser Thr Val Pro Glu His His Pro His Pro Ile
1345 1350 1355

tcc ccc tat gag cac ttg ctc cgg ggc gtg act ggt gtg gac ctg tac
Ser Pro Tyr Glu His Leu Leu Arg Gly Val Thr Gly Val Asp Leu Tyr
1360 1365 1370 1375

cgt ggt cac atc cca ttg gcc ttt gac ccc acc tcc ata ccc cga ggg
Arg Gly His Ile Pro Leu Ala Phe Asp Pro Thr Ser Ile Pro Arg Gly
1380 1385 1390

atc cct ctg gaa gca gca gcc gca gcc tac tac ctg ccc cgg cac ttg
Ile Pro Leu Glu Ala Ala Ala Ala Ala Tyr Tyr Leu Pro Arg His Leu
1395 1400 1405

gcc ccc agc ccc acc tac cca cac ctg tac cca cct tac ctc atc cgc
Ala Pro Ser Pro Thr Tyr Pro His Leu Tyr Pro Pro Tyr Leu Ile Arg
1410 1415 1420

ggc tac cct gac acg gcg gcc ctg gag aac cgc cag acc atc atc aat
Gly Tyr Pro Asp Thr Ala Ala Leu Glu Asn Arg Gln Thr Ile Ile Asn
1425 1430 1435

gac tac atc acc tcg cag cag atg cac cac aac gct gcc tcc gcc atg
Asp Tyr Ile Thr Ser Gln Gln Met His His Asn Ala Ala Ser Ala Met
1440 1445 1450 1455

gcc cag cgt gct gac atg ctg agg ggt ctg tca ccg cga gag tcc tcg
Ala Gln Arg Ala Asp Met Leu Arg Gly Leu Ser Pro Arg Glu Ser Ser
1460 1465 1470

ctg gcc ctc aat tat gcc gct ggc cca aga ggc att atc gac ctg tcc
Leu Ala Leu Asn Tyr Ala Ala Gly Pro Arg Gly Ile Ile Asp Leu Ser
1475 1480 1485

caa gtg cca cac ctg ccc gtg ctg gtg cca cca acg cca ggc acc cct
Gln Val Pro His Leu Pro Val Leu Val Pro Pro Thr Pro Gly Thr Pro
1490 1495 1500

gcc acc gcc atc gac cgc ctt gcc tac ctc ccc act gcg ccc cca ccc
Ala Thr Ala Ile Asp Arg Leu Ala Tyr Leu Pro Thr Ala Pro Pro Pro
1505 1510 1515

ttc agc agc cgc cac agt agc tca ccg ctg tcc cca gga ggc ccc act
Phe Ser Ser Arg His Ser Ser Ser Pro Leu Ser Pro Gly Gly Pro Thr
1520 1525 1530 1535

cac cta gct aaa cca act gcc aca tct tca tcg gag cgg gaa cgg gaa
His Leu Ala Lys Pro Thr Ala Thr Ser Ser Ser Glu Arg Glu Arg Glu
1540 1545 1550

cgt gag cgg gaa cga gac aag tcc atc ctc acg tct acc act aca gtg
Arg Glu Arg Glu Arg Asp Lys Ser Ile Leu Thr Ser Thr Thr Thr Val
1555 1560 1565

gag cat gca ccc atc tgg aga cct ggt acg gag cag agc agc ggg gct
Glu His Ala Pro Ile Trp Arg Pro Gly Thr Glu Gln Ser Ser Gly Ala
1570 1575 1580

ggg ggc agc agc cgc ccc gcc tcc cac acc cac cag cac tcg ccc atc
Gly Gly Ser Ser Arg Pro Ala Ser His Thr His Gln His Ser Pro Ile
1585 1590 1595

tcc ccc cgg acc cag gac gcc ttg cag cag agg ccc agt gtg ctg cac
Ser Pro Arg Thr Gln Asp Ala Leu Gln Gln Arg Pro Ser Val Leu His
1600 1605 1610 1615

aac acg agc atg aag ggc gtg gtc acc tcc gtg gaa ccc ggc acg ccc
Asn Thr Ser Met Lys Gly Val Val Thr Ser Val Glu Pro Gly Thr Pro
1620 1625 1630

acg gtc ctg agg tgg gcc agg tcc acc tcc acc tct tcg cct gtc cgc
Thr Val Leu Arg Trp Ala Arg Ser Thr Ser Thr Ser Ser Pro Val Arg
1635 1640 1645

cca gct gcc aca ttc cca cct gcc acc cac tgc cca ctt ggt ggc acc
Pro Ala Ala Thr Phe Pro Pro Ala Thr His Cys Pro Leu Gly Gly Thr
1650 1655 1660

ctt gaa ggg gtc tac cct acc ctc atg gag ccc gtc ctg tta ccc aag
Leu Glu Gly Val Tyr Pro Thr Leu Met Glu Pro Val Leu Leu Pro Lys
1665 1670 1675

gag acc tct cgg gtc gcc cgg ccc gag cgg gcc cgg gtg gac gct ggc
Glu Thr Ser Arg Val Ala Arg Pro Glu Arg Ala Arg Val Asp Ala Gly
1680 1685 1690 1695

cat gcc ttt ctt acc aaa ccc ccg ggc cgg gag ccc gcc tcc tca ccc
His Ala Phe Leu Thr Lys Pro Pro Gly Arg Glu Pro Ala Ser Ser Pro
1700 1705 1710 
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age aag age tec gag ccc cga tec eta gca cec ccc age tec age cac 

Ser Lys Ser Ser Glu Pro Arg Ser Leu Ala Pro Pro Ser Ser Ser His 
1715 1720 1725 

aca gcc ate gcc cgc acc cca gca aag aac ctt gca ccc cac cat gcc 
Thr Ala lie Ala Arg Thr Pro Ala Lys Asn Leu Ala Pro His His Ala 
1730 1735 1740 

agt ecg gae ccg ccg geg ece ace teg gee tea gat ctg cac ega gaa 
Ser Pro Asp Pro Pro Ala Pro Thr Ser Ala Ser Asp Leu His Arg Glu 
1745 1750 1755 

aag act caa agt aaa cee ttt tec ate eag gaa ttg gaa etc egt tet 
Lys Thr Gin Ser Lys Pro Phe Ser lie Gin Glu Leu Glu Leu Arg Ser 
1760 1765 1770 1775 

ctg ggt tac cac agt gga gct ggc tac agc ccc gat ggg gtg gag ccc 
Leu Gly Tyr His Ser Gly Ala Gly Tyr Ser Pro Asp Gly Val Glu Pro 
1780 1785 1790 

atc agc ccg gtg agc tcc ccc agc ctg acc cac gac aag ggg ctc tcc 
Ile Ser Pro Val Ser Ser Pro Ser Leu Thr His Asp Lys Gly Leu Ser 
1795 1800 1805 

aaa cct ctg gaa gag cta gag aag agc cac ttg gaa ggg gag ctg cgg 
Lys Pro Leu Glu Glu Leu Glu Lys Ser His Leu Glu Gly Glu Leu Arg 
1810 1815 1820 

cac aag cag cca ggc ccc atg aag ctc agc gcg gag gct gcc cat ctc 
His Lys Gln Pro Gly Pro Met Lys Leu Ser Ala Glu Ala Ala His Leu 
1825 1830 1835 

cca cat ctg cgg cca ctg ccc gag agc cag ccc tca tcc agc cca ctc 
Pro His Leu Arg Pro Leu Pro Glu Ser Gln Pro Ser Ser Ser Pro Leu 
1840 1845 1850 1855 

ctc cag act gcc cca ggc atc aaa ggt cac cag agg gtg gtc acc ctg 
Leu Gln Thr Ala Pro Gly Ile Lys Gly His Gln Arg Val Val Thr Leu 
1860 1865 1870 

gct cag cac atc agc gag gtc att acg cag gac tac acg cgc cac cac 
Ala Gln His Ile Ser Glu Val Ile Thr Gln Asp Tyr Thr Arg His His 
1875 1880 1885 

ccg cag cag ctc agt ggc ccc ctt ccc gcc cct ctc tac tcc ttt ccc 
Pro Gln Gln Leu Ser Gly Pro Leu Pro Ala Pro Leu Tyr Ser Phe Pro 
1890 1895 1900 

gga gcc agc tgc cct gtc ctg gat ctt cgc cgc cca ccc agt gac ctc 
Gly Ala Ser Cys Pro Val Leu Asp Leu Arg Arg Pro Pro Ser Asp Leu 
1905 1910 1915 

tac ctc cca ccc ccc gac cat ggc acc cca gcc cgg gga tcc ccc cac 
Tyr Leu Pro Pro Pro Asp His Gly Thr Pro Ala Arg Gly Ser Pro His 
1920 1925 1930 1935 

agt gaa ggg ggc aaa agg tcc cca gaa ccc agc aaa aca tcg gtc ctg 
Ser Glu Gly Gly Lys Arg Ser Pro Glu Pro Ser Lys Thr Ser Val Leu 
1940 1945 1950 

ggc agc agc gag gat gcc att gag cct gtg tcc cca cca gag ggc atg 
Gly Ser Ser Glu Asp Ala Ile Glu Pro Val Ser Pro Pro Glu Gly Met 
1955 1960 1965 

act gag cca gga cat gct cgg agc act gcg tac cca ctg ctg tat cga 
Thr Glu Pro Gly His Ala Arg Ser Thr Ala Tyr Pro Leu Leu Tyr Arg 
1970 1975 1980 

gac ggg gaa cag ggc gag ccc agg atg ggt cta gag tct cca ggc aac 
Asp Gly Glu Gln Gly Glu Pro Arg Met Gly Leu Glu Ser Pro Gly Asn 
1985 1990 1995 

acc agc cag ccg cca acc ttc ttc agt aag ctg act gag agc aac tcc 
Thr Ser Gln Pro Pro Thr Phe Phe Ser Lys Leu Thr Glu Ser Asn Ser 
2000 2005 2010 2015 

gcc atg gtg aag tcg aag aag cag gag atc aac aag aaa ctc aac acc 
Ala Met Val Lys Ser Lys Lys Gln Glu Ile Asn Lys Lys Leu Asn Thr 
2020 2025 2030 

cac aac cgg aac gag cca gaa tac aat att ggc cag cct ggg acg gaa 
His Asn Arg Asn Glu Pro Glu Tyr Asn Ile Gly Gln Pro Gly Thr Glu 
2035 2040 2045 

atc ttc aac atg ccc gcc atc act gga gca ggc ctt atg acc tgt aga 
Ile Phe Asn Met Pro Ala Ile Thr Gly Ala Gly Leu Met Thr Cys Arg 
2050 2055 2060 

agc cag gcg gtg caa gaa cac gcc agc acc aac atg ggg cta gag gcc 
Ser Gln Ala Val Gln Glu His Ala Ser Thr Asn Met Gly Leu Glu Ala 
2065 2070 2075 

att att aga aag gca ctc atg ggt aaa tat gat cag tgg gaa gag ccc 
Ile Ile Arg Lys Ala Leu Met Gly Lys Tyr Asp Gln Trp Glu Glu Pro 
2080 2085 2090 2095 

ccg ccg ctc ggc gcc aat gct ttt aac cct ctg aat gcc agc gcc agt 
Pro Pro Leu Gly Ala Asn Ala Phe Asn Pro Leu Asn Ala Ser Ala Ser 
2100 2105 2110 

ctg ccc gct gct gct atg ccc ata acc act gct gac gga cgg agt gac 
Leu Pro Ala Ala Ala Met Pro Ile Thr Thr Ala Asp Gly Arg Ser Asp 
2115 2120 2125 

cac gca ctc acc tcg cca ggt gga ggt ggg aaa gcc aag gtc tct ggc 
His Ala Leu Thr Ser Pro Gly Gly Gly Gly Lys Ala Lys Val Ser Gly 
2130 2135 2140 

aga cct agc agc cga aaa gcc aag tcg cca gca cca ggc cta gcg tcc 
Arg Pro Ser Ser Arg Lys Ala Lys Ser Pro Ala Pro Gly Leu Ala Ser 
2145 2150 2155 

gga gac cga ccc cct tct gtc tcc tca gta cac tca gag ggg gac tgc 
Gly Asp Arg Pro Pro Ser Val Ser Ser Val His Ser Glu Gly Asp Cys 
2160 2165 2170 2175 

aat cgc cga aca cca ctc acc aac cgt gtg tgg gag gac cgg ccc tca 
Asn Arg Arg Thr Pro Leu Thr Asn Arg Val Trp Glu Asp Arg Pro Ser 
2180 2185 2190 

tct gca ggg tcc acg cca ttc ccc tac aac cct ttg att atg agg cta 6983 
Ser Ala Gly Ser Thr Pro Phe Pro Tyr Asn Pro Leu Ile Met Arg Leu 
2195 2200 2205 

cag gca ggt gtc atg gcc tcc ccg ccc cca cct ggc ctt gcg gca ggc 7031 
Gln Ala Gly Val Met Ala Ser Pro Pro Pro Pro Gly Leu Ala Ala Gly 
2210 2215 2220 

agc ggg ccc cta gct ggt ccc cac cac gcc tgg gat gag gag ccc aag 7079 
Ser Gly Pro Leu Ala Gly Pro His His Ala Trp Asp Glu Glu Pro Lys 
2225 2230 2235 

cca ctg ctg tgt tca cag tat gag aca ctc tcg gac agc gag tga 7124 
Pro Leu Leu Cys Ser Gln Tyr Glu Thr Leu Ser Asp Ser Glu * 
2240 2245 2250 

ccacggattg ggggggagcg gtgccaggtc ccgcacaagg cagaagcagc ccagcatgga 7184 

gcagacagct gctgactccc gagactgagg aaggagcccc tgagtctgcc tgcgcgtcca 7244 

tccgtncgtc gtncactcat ctgtccatcc agagctggca ttctgcctgt ctaaagcctt 7304 

aactaagact tccaccccgg gctggccctg cgcagtgacc ttacactcag gggattgttt 7364 

accttggtgc tcganaaggg ggagtggaca ggaaggggag ggacaagccg ggccangagg 7424 

SaggSSS^ca ancaattcgt gtgtcaagtc gcactcntgc t 7465 

1205 1210 1215 

Thr Gly Thr Lys Lys His Asp Val Arg Ser Ile Ile Gly Ser Pro Gly 
1220 1225 1230 

Arg Pro Phe Pro Ala Leu His Pro Leu Asp Ile Met Ala Asp Ala Arg 
1235 1240 1245 

Ala Leu Glu Arg Ala Cys Tyr Glu Glu Ser Leu Lys Ser Arg Ser Gly 
1250 1255 1260 

Thr Ser Ser Gly Ala Gly Gly Ser Ile Thr Arg Gly Ala Pro Val Val 
1265 1270 1275 1280 

Val Pro Glu Leu Gly Lys Pro Arg Gln Ser Pro Leu Thr Tyr Glu Asp 
1285 1290 1295 

His Gly Ala Pro Phe Thr Ser His Leu Pro Arg Gly Ser Pro Val Thr 
1300 1305 1310 

Thr Arg Glu Pro Thr Pro Arg Leu Gln Glu Gly Ser Leu Leu Ser Ser 
1315 1320 1325 

Lys Ala Ser Gln Asp Arg Lys Leu Thr Ser Thr Pro Arg Glu Ile Ala 
1330 1335 1340 

Lys Ser Pro His Ser Thr Val Pro Glu His His Pro His Pro Ile Ser 
1345 1350 1355 1360 

Pro Tyr Glu His Leu Leu Arg Gly Val Thr Gly Val Asp Leu Tyr Arg 
1365 1370 1375 

Gly His Ile Pro Leu Ala Phe Asp Pro Thr Ser Ile Pro Arg Gly Ile 
1380 1385 1390 

Pro Leu Glu Ala Ala Ala Ala Ala Tyr Tyr Leu Pro Arg His Leu Ala 
1395 1400 1405 

Pro Ser Pro Thr Tyr Pro His Leu Tyr Pro Pro Tyr Leu Ile Arg Gly 
1410 1415 1420 

Tyr Pro Asp Thr Ala Ala Leu Glu Asn Arg Gln Thr Ile Ile Asn Asp 
1425 1430 1435 1440 

Tyr Ile Thr Ser Gln Gln Met His His Asn Ala Ala Ser Ala Met Ala 
1445 1450 1455 

Gln Arg Ala Asp Met Leu Arg Gly Leu Ser Pro Arg Glu Ser Ser Leu 
1460 1465 1470 

Ala Leu Asn Tyr Ala Ala Gly Pro Arg Gly Ile Ile Asp Leu Ser Gln 
1475 1480 1485 

Val Pro His Leu Pro Val Leu Val Pro Pro Thr Pro Gly Thr Pro Ala 
1490 1495 1500 

Thr Ala Ile Asp Arg Leu Ala Tyr Leu Pro Thr Ala Pro Pro Pro Phe 
1505 1510 1515 1520 

Ser Ser Arg His Ser Ser Ser Pro Leu Ser Pro Gly Gly Pro Thr His 
1525 1530 1535 

Leu Ala Lys Pro Thr Ala Thr Ser Ser Ser Glu Arg Glu Arg Glu Arg 
1540 1545 1550 

Glu Arg Glu Arg Asp Lys Ser Ile Leu Thr Ser Thr Thr Thr Val Glu 
1555 1560 1565 

His Ala Pro Ile Trp Arg Pro Gly Thr Glu Gln Ser Ser Gly Ala Gly 
1570 1575 1580 

Gly Ser Ser Arg Pro Ala Ser His Thr His Gln His Ser Pro Ile Ser 
1585 1590 1595 1600 

Pro Arg Thr Gln Asp Ala Leu Gln Gln Arg Pro Ser Val Leu His Asn 
1605 1610 1615 

Thr Ser Met Lys Gly Val Val Thr Ser Val Glu Pro Gly Thr Pro Thr 
1620 1625 1630 

Val Leu Arg Trp Ala Arg Ser Thr Ser Thr Ser Ser Pro Val Arg Pro 
1635 1640 1645 

Ala Ala Thr Phe Pro Pro Ala Thr His Cys Pro Leu Gly Gly Thr Leu 
1650 1655 1660 

Glu Gly Val Tyr Pro Thr Leu Met Glu Pro Val Leu Leu Pro Lys Glu 
1665 1670 1675 1680 

Thr Ser Arg Val Ala Arg Pro Glu Arg Ala Arg Val Asp Ala Gly His 
1685 1690 1695 

Ala Phe Leu Thr Lys Pro Pro Gly Arg Glu Pro Ala Ser Ser Pro Ser 
1700 1705 1710 

Lys Ser Ser Glu Pro Arg Ser Leu Ala Pro Pro Ser Ser Ser His Thr 
1715 1720 1725 

Ala Ile Ala Arg Thr Pro Ala Lys Asn Leu Ala Pro His His Ala Ser 
1730 1735 1740 

Pro Asp Pro Pro Ala Pro Thr Ser Ala Ser Asp Leu His Arg Glu Lys 
1745 1750 1755 1760 

Thr Gln Ser Lys Pro Phe Ser Ile Gln Glu Leu Glu Leu Arg Ser Leu 
1765 1770 1775 

Gly Tyr His Ser Gly Ala Gly Tyr Ser Pro Asp Gly Val Glu Pro Ile 
1780 1785 1790 

Ser Pro Val Ser Ser Pro Ser Leu Thr His Asp Lys Gly Leu Ser Lys 
1795 1800 1805 

Pro Leu Glu Glu Leu Glu Lys Ser His Leu Glu Gly Glu Leu Arg His 
1810 1815 1820 

Lys Gln Pro Gly Pro Met Lys Leu Ser Ala Glu Ala Ala His Leu Pro 
1825 1830 1835 1840 

His Leu Arg Pro Leu Pro Glu Ser Gln Pro Ser Ser Ser Pro Leu Leu 
1845 1850 1855 

Gln Thr Ala Pro Gly Ile Lys Gly His Gln Arg Val Val Thr Leu Ala 
1860 1865 1870 

Gln His Ile Ser Glu Val Ile Thr Gln Asp Tyr Thr Arg His His Pro 
1875 1880 1885 

Gln Gln Leu Ser Gly Pro Leu Pro Ala Pro Leu Tyr Ser Phe Pro Gly 
1890 1895 1900 

Ala Ser Cys Pro Val Leu Asp Leu Arg Arg Pro Pro Ser Asp Leu Tyr 
1905 1910 1915 1920 

Leu Pro Pro Pro Asp His Gly Thr Pro Ala Arg Gly Ser Pro His Ser 
1925 1930 1935 

Glu Gly Gly Lys Arg Ser Pro Glu Pro Ser Lys Thr Ser Val Leu Gly 
1940 1945 1950 

Ser Ser Glu Asp Ala Ile Glu Pro Val Ser Pro Pro Glu Gly Met Thr 
1955 1960 1965 

Glu Pro Gly His Ala Arg Ser Thr Ala Tyr Pro Leu Leu Tyr Arg Asp 
1970 1975 1980 

Gly Glu Gln Gly Glu Pro Arg Met Gly Leu Glu Ser Pro Gly Asn Thr 
1985 1990 1995 2000 

Ser Gln Pro Pro Thr Phe Phe Ser Lys Leu Thr Glu Ser Asn Ser Ala 
2005 2010 2015 

Met Val Lys Ser Lys Lys Gln Glu Ile Asn Lys Lys Leu Asn Thr His 
2020 2025 2030 

Asn Arg Asn Glu Pro Glu Tyr Asn Ile Gly Gln Pro Gly Thr Glu Ile 
2035 2040 2045 

Phe Asn Met Pro Ala Ile Thr Gly Ala Gly Leu Met Thr Cys Arg Ser 
2050 2055 2060 

Gln Ala Val Gln Glu His Ala Ser Thr Asn Met Gly Leu Glu Ala Ile 
2065 2070 2075 2080 

Ile Arg Lys Ala Leu Met Gly Lys Tyr Asp Gln Trp Glu Glu Pro Pro 
2085 2090 2095 

Pro Leu Gly Ala Asn Ala Phe Asn Pro Leu Asn Ala Ser Ala Ser Leu 
2100 2105 2110 

Pro Ala Ala Ala Met Pro Ile Thr Thr Ala Asp Gly Arg Ser Asp His 
2115 2120 2125 

Ala Leu Thr Ser Pro Gly Gly Gly Gly Lys Ala Lys Val Ser Gly Arg 
2130 2135 2140 

Pro Ser Ser Arg Lys Ala Lys Ser Pro Ala Pro Gly Leu Ala Ser Gly 
2145 2150 2155 2160 

Asp Arg Pro Pro Ser Val Ser Ser Val His Ser Glu Gly Asp Cys Asn 
2165 2170 2175 

Arg Arg Thr Pro Leu Thr Asn Arg Val Trp Glu Asp Arg Pro Ser Ser 
2180 2185 2190 

Ala Gly Ser Thr Pro Phe Pro Tyr Asn Pro Leu Ile Met Arg Leu Gln 
2195 2200 2205 

Ala Gly Val Met Ala Ser Pro Pro Pro Pro Gly Leu Ala Ala Gly Ser 
2210 2215 2220 

Gly Pro Leu Ala Gly Pro His His Ala Trp Asp Glu Glu Pro Lys Pro 
2225 2230 2235 2240 

Leu Leu Cys Ser Gln Tyr Glu Thr Leu Ser Asp Ser Glu 
2245 2250 



<210> 10 

<211> 7940 

<212> DNA 

<213> Homo sapiens 

<220> 
<221> CDS 

<222> (241) . . . (7563) 



<400> 10 

ccaagatggc ggccaaggtg gcgaagcagc agccgcggcg gcggcggcgg ctggagtgag 60 

cgtccgactc gccgcgccga acgaggtccc ggtgtagggc cgcgcgccgt ggccgcgtcc 120 

cactcctcag gccggggcgc acgtcggctc ccacgcttag ccagctcccg gtggtttcct 180 

agaaacatga ttgtttattg gcattgatct cacagtctgg tgaggacttc tttactgata 240 

atg tca agt tca ggt tat cat ccc aac caa gga gca ttc agc aca gaa 288 
Met Ser Ser Ser Gly Tyr Pro Pro Asn Gln Gly Ala Phe Ser Thr Glu 
1 5 10 15 

caa agt cgt tat cct cct cac tct gtc cag tat aca ttt ccc aac acc 336 
Gln Ser Arg Tyr Pro Pro His Ser Val Gln Tyr Thr Phe Pro Asn Thr 
20 25 30 

cgc cac cag cag gag ttc gca gtc cct gat tat cgt tcc tct cat ctt 384 
Arg His Gln Gln Glu Phe Ala Val Pro Asp Tyr Arg Ser Ser His Leu 
35 40 45 

gaa gtg agt cag gca tca cag ctt ttg cag caa cag cag cag caa cag 432 
Glu Val Ser Gln Ala Ser Gln Leu Leu Gln Gln Gln Gln Gln Gln Gln 
50 55 60 

ctt cga agg cga cct tcc ttg ctt tca gaa ttt cac cca ggt tct gac 480 
Leu Arg Arg Arg Pro Ser Leu Leu Ser Glu Phe His Pro Gly Ser Asp 
65 70 75 80 



agg cct caa gaa agg aga act agt tat gaa ccg ttt cat cca ggc cca 528 
Arg Pro Gln Glu Arg Arg Thr Ser Tyr Glu Pro Phe His Pro Gly Pro 
85 90 95 

tcc cca gtg gat cat gat tca ctg gaa tcg aag cga cca cgt ctg gaa 576 
Ser Pro Val Asp His Asp Ser Leu Glu Ser Lys Arg Pro Arg Leu Glu 
100 105 110 

cag gtt tct gat tct cat ttt cag cgt gtc agt gct gcg gtt ttg cct 624 
Gln Val Ser Asp Ser His Phe Gln Arg Val Ser Ala Ala Val Leu Pro 
115 120 125 

tta gtg cac ccg ctg cca gaa ggg ctg agg gct tct gca gat gct aag 672 
Leu Val His Pro Leu Pro Glu Gly Leu Arg Ala Ser Ala Asp Ala Lys 
130 135 140 



aag gat cca gca ttc gga ggc aaa cat gaa gct cca tcc tct cca att 
Lys Asp Pro Ala Phe Gly Gly Lys His Glu Ala Pro Ser Ser Pro Ile 
145 150 155 160 

tcg ggg caa cca tgt gga gat gat caa aat gct tca cct tca aaa ctc 
Ser Gly Gln Pro Cys Gly Asp Asp Gln Asn Ala Ser Pro Ser Lys Leu 
165 170 175 

tca aag gaa gag tta ata cag agt atg gat cgt gta gat cga gaa att 
Ser Lys Glu Glu Leu Ile Gln Ser Met Asp Arg Val Asp Arg Glu Ile 
180 185 190 

gca aaa gta gaa cag cag atc ctt aaa ctg aaa aag aaa caa caa cag 
Ala Lys Val Glu Gln Gln Ile Leu Lys Leu Lys Lys Lys Gln Gln Gln 
195 200 205 

ctt gaa gaa gag gca gct aaa cct cct gag cct gag aag ccc gtg tcc 
Leu Glu Glu Glu Ala Ala Lys Pro Pro Glu Pro Glu Lys Pro Val Ser 
210 215 220 

cct cct cct gtg gag cag aaa cac cgc agt att gtc caa att att tat 
Pro Pro Pro Val Glu Gln Lys His Arg Ser Ile Val Gln Ile Ile Tyr 
225 230 235 240 

gat gag aat cgg aaa aaa gca gaa gaa gct cat aaa att ttt gaa ggt 
Asp Glu Asn Arg Lys Lys Ala Glu Glu Ala His Lys Ile Phe Glu Gly 
245 250 255 

ctt ggc cca aaa gtt gaa ctg cca ctg tat aac cag cca tca gat acc 
Leu Gly Pro Lys Val Glu Leu Pro Leu Tyr Asn Gln Pro Ser Asp Thr 
260 265 270 

aag gtg tac cat gag aac atc aag aca aac cag gtg atg agg aaa aaa 
Lys Val Tyr His Glu Asn Ile Lys Thr Asn Gln Val Met Arg Lys Lys 
275 280 285 

ctc att tta ttt ttt aaa aga aga aat cat gca aga aaa caa agg gaa 
Leu Ile Leu Phe Phe Lys Arg Arg Asn His Ala Arg Lys Gln Arg Glu 
290 295 300 

caa aaa atc tgc cag cgt tat gat cag ctc atg gag gca tgg gag aaa 
Gln Lys Ile Cys Gln Arg Tyr Asp Gln Leu Met Glu Ala Trp Glu Lys 
305 310 315 320 

aaa gtg gac aga ata gaa aat aat cct cgg agg aaa gct aaa gaa agc 
Lys Val Asp Arg Ile Glu Asn Asn Pro Arg Arg Lys Ala Lys Glu Ser 
325 330 335 

aaa aca agg gaa tac tat gaa aag cag ttt cca gaa att cga aaa caa 
Lys Thr Arg Glu Tyr Tyr Glu Lys Gln Phe Pro Glu Ile Arg Lys Gln 
340 345 350 

aga gaa cag caa gaa aga ttt cag cga gtt ggg cag agg gga gct ggt 
Arg Glu Gln Gln Glu Arg Phe Gln Arg Val Gly Gln Arg Gly Ala Gly 
355 360 365 

ctt tca gcc acc att gct agg agt gag cat gag att tct gaa att att 
Leu Ser Ala Thr Ile Ala Arg Ser Glu His Glu Ile Ser Glu Ile Ile 
370 375 380 

gat ggg ctc tct gag cag gag aat aat gag aaa caa atg cgg cag ctc 
Asp Gly Leu Ser Glu Gln Glu Asn Asn Glu Lys Gln Met Arg Gln Leu 
385 390 395 400 

tct gtg att cca cct atg atg ttt gat gca gaa caa aga cga gtc aag 
Ser Val Ile Pro Pro Met Met Phe Asp Ala Glu Gln Arg Arg Val Lys 
405 410 415 

ttc att aac atg aat ggg ctt atg gag gac cct atg aaa gtg tat aaa 
Phe Ile Asn Met Asn Gly Leu Met Glu Asp Pro Met Lys Val Tyr Lys 
420 425 430 

gat agg cag ttt atg aat gtt tgg act gac cat gaa aag gag atc ttt 
Asp Arg Gln Phe Met Asn Val Trp Thr Asp His Glu Lys Glu Ile Phe 
435 440 445 

aag gac aag ttt atc cag cat cca aaa aac ttt gga cta att gca tca 
Lys Asp Lys Phe Ile Gln His Pro Lys Asn Phe Gly Leu Ile Ala Ser 
450 455 460 

tac ttg gag agg aag agt gtt cct gat tgt gtt ttg tat tac tat tta 
Tyr Leu Glu Arg Lys Ser Val Pro Asp Cys Val Leu Tyr Tyr Tyr Leu 
465 470 475 480 

acc aag aaa aat gag aat tat aaa gcc ctc gtc aga agg aat tat ggg 
Thr Lys Lys Asn Glu Asn Tyr Lys Ala Leu Val Arg Arg Asn Tyr Gly 
485 490 495 

aaa cgc aga ggc aga aac cag caa att gct cga ccc tcg caa gaa gaa 
Lys Arg Arg Gly Arg Asn Gln Gln Ile Ala Arg Pro Ser Gln Glu Glu 
500 505 510 

aaa gta gaa gaa aaa gaa gag gat aaa gca gaa aaa aca gaa aaa aaa 
Lys Val Glu Glu Lys Glu Glu Asp Lys Ala Glu Lys Thr Glu Lys Lys 
515 520 525 

gaa gaa gaa aag aaa gat gaa gag gaa aaa gat gaa aaa gaa gac tcc 
Glu Glu Glu Lys Lys Asp Glu Glu Glu Lys Asp Glu Lys Glu Asp Ser 
530 535 540 

aaa gaa aat acc aag gaa aag gac aag ata gat ggt aca gca gaa gaa 
Lys Glu Asn Thr Lys Glu Lys Asp Lys Ile Asp Gly Thr Ala Glu Glu 
545 550 555 560 

act gag gaa aga gag caa gcc aca ccc cgg ggg cga aag act gcc aac 
Thr Glu Glu Arg Glu Gln Ala Thr Pro Arg Gly Arg Lys Thr Ala Asn 
565 570 575 

agt cag ggc cgc cgt aag ggc cgg atc acc agg tcc atg aca aac gaa 
Ser Gln Gly Arg Arg Lys Gly Arg Ile Thr Arg Ser Met Thr Asn Glu 
580 585 590 

gct gca gct gcc agt gct gca gcc gca gcg gct act gaa gag ccc cca 
Ala Ala Ala Ala Ser Ala Ala Ala Ala Ala Ala Thr Glu Glu Pro Pro 
595 600 605 



cca cct ctg cca ccg cca cca gaa ccc att tct aca gag cct gtg gag 2112 
Pro Pro Leu Pro Pro Pro Pro Glu Pro Ile Ser Thr Glu Pro Val Glu 
610 615 620 

acc tct cga tgg aca gaa gaa gaa atg gaa gtt gct aaa aaa ggt cta 
Thr Ser Arg Trp Thr Glu Glu Glu Met Glu Val Ala Lys Lys Gly Leu 
625 630 635 640 

gta gaa cat ggt cgt aac tgg gca gca att gct aaa atg gtg gga acg 
Val Glu His Gly Arg Asn Trp Ala Ala Ile Ala Lys Met Val Gly Thr 
645 650 655 

aaa agt gaa gct caa tgt aaa aac ttc tat ttt aac tat aaa agg cga 
Lys Ser Glu Ala Gln Cys Lys Asn Phe Tyr Phe Asn Tyr Lys Arg Arg 
660 665 670 

cac aat ctt gac aac ctc tta cag cag cat aaa cag aaa act tca cga 
His Asn Leu Asp Asn Leu Leu Gln Gln His Lys Gln Lys Thr Ser Arg 
675 680 685 

aaa cct cgt gaa gag cga gat gtg tct caa tgt gaa agt gtc gct tcc 
Lys Pro Arg Glu Glu Arg Asp Val Ser Gln Cys Glu Ser Val Ala Ser 
690 695 700 

act gtt tct gct cag gag gat gaa gat att gaa gcc tcc aat gaa gaa 
Thr Val Ser Ala Gln Glu Asp Glu Asp Ile Glu Ala Ser Asn Glu Glu 
705 710 715 720 

gaa aat cca gaa gac agc gaa gtt gaa gct gtc aag ccc agc gag gac 
Glu Asn Pro Glu Asp Ser Glu Val Glu Ala Val Lys Pro Ser Glu Asp 
725 730 735 

agt cct gaa aat gct act tct cga gga aac aca gaa cct gcg gtt gag 
Ser Pro Glu Asn Ala Thr Ser Arg Gly Asn Thr Glu Pro Ala Val Glu 
740 745 750 

ctt gag ccc acc acg gaa act gca ccc agt aca tct ccc tcc tta gca 
Leu Glu Pro Thr Thr Glu Thr Ala Pro Ser Thr Ser Pro Ser Leu Ala 
755 760 765 

gtt cca agt aca aaa cca gct gaa gat gaa agt gtg gag acc cag gtg 
Val Pro Ser Thr Lys Pro Ala Glu Asp Glu Ser Val Glu Thr Gln Val 
770 775 780 

aat gac agc atc agt gct gag aca gca gag cag atg gat gta gat cag 
Asn Asp Ser Ile Ser Ala Glu Thr Ala Glu Gln Met Asp Val Asp Gln 
785 790 795 800 

cag gag cac agt gct gaa gag ggt tct gtt tgt gat ccc cca ccc gct 
Gln Glu His Ser Ala Glu Glu Gly Ser Val Cys Asp Pro Pro Pro Ala 
805 810 815 

acc aaa gct gac tct gtg gac gtt gaa gtg agg gtg cca gaa aac cat 
Thr Lys Ala Asp Ser Val Asp Val Glu Val Arg Val Pro Glu Asn His 
820 825 830 

gca tct aaa gtt gaa ggt gat aat acc aaa gaa aga gac ttg gat aga 
Ala Ser Lys Val Glu Gly Asp Asn Thr Lys Glu Arg Asp Leu Asp Arg 
835 840 845 

gcc agt gag aag gtg gaa cct aga gat gaa gat ttg gtg gta gct cag 
Ala Ser Glu Lys Val Glu Pro Arg Asp Glu Asp Leu Val Val Ala Gln 
850 855 860 

caa ata aat gcc caa agg ccc gag ccc cag tca gac aat gat tcc agt 
Gln Ile Asn Ala Gln Arg Pro Glu Pro Gln Ser Asp Asn Asp Ser Ser 
865 870 875 880 

gcc acg tgc agc gct gat gag gat gtg gat gga gag cca gag agg cag 
Ala Thr Cys Ser Ala Asp Glu Asp Val Asp Gly Glu Pro Glu Arg Gln 
885 890 895 

aga atg ttt cct atg gac tca aag cct tca ctg tta aac ccc act gga 
Arg Met Phe Pro Met Asp Ser Lys Pro Ser Leu Leu Asn Pro Thr Gly 
900 905 910 

tct ata ctc gtc tca tct ccg tta aaa cca aat cca ctg gat ctg cca 
Ser Ile Leu Val Ser Ser Pro Leu Lys Pro Asn Pro Leu Asp Leu Pro 
915 920 925 

cag ctt cag cat cga gct gct gtt atc cca cca atg gta tcc tgc acc 
Gln Leu Gln His Arg Ala Ala Val Ile Pro Pro Met Val Ser Cys Thr 
930 935 940 

cca tgt aac ata cca att gga acc cca gtg agc ggc tat gct ctc tac 
Pro Cys Asn Ile Pro Ile Gly Thr Pro Val Ser Gly Tyr Ala Leu Tyr 
945 950 955 960 

cag cga cac att aaa gca atg cat gag tca gca ctc ctg gag gag cag 
Gln Arg His Ile Lys Ala Met His Glu Ser Ala Leu Leu Glu Glu Gln 
965 970 975 

cgg cag aga caa gaa cag ata gat ttg gaa tgt aga agt tct aca agt 
Arg Gln Arg Gln Glu Gln Ile Asp Leu Glu Cys Arg Ser Ser Thr Ser 
980 985 990 

cca tgt ggc aca tcc aag agt cca aac aga gag tgg gaa gtc ctt cag 
Pro Cys Gly Thr Ser Lys Ser Pro Asn Arg Glu Trp Glu Val Leu Gln 
995 1000 1005 

cct gct cca cat caa ttg ata act aat ctc cct gaa ggc gtt cgg ctt 
Pro Ala Pro His Gln Leu Ile Thr Asn Leu Pro Glu Gly Val Arg Leu 
1010 1015 1020 

ccg aca act cga cca acc agg cca ccg ccc cct ctc atc ccg tca tcc 
Pro Thr Thr Arg Pro Thr Arg Pro Pro Pro Pro Leu Ile Pro Ser Ser 
1025 1030 1035 1040 

aaa acc aca gtg gct tca gaa aaa cca tct ttt ata atg gga ggc tcc 
Lys Thr Thr Val Ala Ser Glu Lys Pro Ser Phe Ile Met Gly Gly Ser 
1045 1050 1055 

atc tca cag gga aca cca ggc act tat ttg act tct cat aat cag gct 
Ile Ser Gln Gly Thr Pro Gly Thr Tyr Leu Thr Ser His Asn Gln Ala 
1060 1065 1070 

tcc tac act caa gaa aca ccc aag ccg tca gta gga tct atc tct ctt 
Ser Tyr Thr Gln Glu Thr Pro Lys Pro Ser Val Gly Ser Ile Ser Leu 
1075 1080 1085 

gga ctg cca cgg caa cag gaa tct gcc aaa tca gct act ttg ccc tac 
Gly Leu Pro Arg Gln Gln Glu Ser Ala Lys Ser Ala Thr Leu Pro Tyr 
1090 1095 1100 

atc aag cag gaa gaa ttt tct ccc cga agc caa aac tca caa cct gag 
Ile Lys Gln Glu Glu Phe Ser Pro Arg Ser Gln Asn Ser Gln Pro Glu 
1105 1110 1115 1120 

ggt ctg ttg gtc agg gcc caa cat gaa ggt gta gtc aga ggt acc gca 
Gly Leu Leu Val Arg Ala Gln His Glu Gly Val Val Arg Gly Thr Ala 
1125 1130 1135 

gga gcc ata caa gaa gga agt ata act cgg gga act cca acc agc aaa 
Gly Ala Ile Gln Glu Gly Ser Ile Thr Arg Gly Thr Pro Thr Ser Lys 
1140 1145 1150 

att tca gtg gag agc att cca tcc cta cgg ggc tct atc act cag ggc 
Ile Ser Val Glu Ser Ile Pro Ser Leu Arg Gly Ser Ile Thr Gln Gly 
1155 1160 1165 

acc ccg gct ctg ccc cag act ggc ata cca aca gag gct ttg gtg aag 
Thr Pro Ala Leu Pro Gln Thr Gly Ile Pro Thr Glu Ala Leu Val Lys 
1170 1175 1180 

ggg tcc att tcg aga atg ccc att gaa gac agc agt cct gag aaa ggc 
Gly Ser Ile Ser Arg Met Pro Ile Glu Asp Ser Ser Pro Glu Lys Gly 
1185 1190 1195 1200 

aga gag gaa gct gca tcc aaa ggc cat gtt att tat gaa ggc aaa agt 
Arg Glu Glu Ala Ala Ser Lys Gly His Val Ile Tyr Glu Gly Lys Ser 
1205 1210 1215 

gga cat atc ttg tca tat gat aat att aag aat gcc cga gaa ggg act 
Gly His Ile Leu Ser Tyr Asp Asn Ile Lys Asn Ala Arg Glu Gly Thr 
1220 1225 1230 

agg agt cca aga aca gct cat gaa atc agt tta aag aga agc tat gaa 
Arg Ser Pro Arg Thr Ala His Glu Ile Ser Leu Lys Arg Ser Tyr Glu 
1235 1240 1245 

tca gtg gaa gga aat ata aag caa ggg atg tca atg agg gag tct cct 
Ser Val Glu Gly Asn Ile Lys Gln Gly Met Ser Met Arg Glu Ser Pro 
1250 1255 1260 

gta tca gca ccg tta gag ggg ctg ata tgc cga gca tta ccc agg ggg 
Val Ser Ala Pro Leu Glu Gly Leu Ile Cys Arg Ala Leu Pro Arg Gly 
1265 1270 1275 1280 

agt cct cat tct gac ctc aaa gaa agg act gta ttg tct ggc tcc ata 
Ser Pro His Ser Asp Leu Lys Glu Arg Thr Val Leu Ser Gly Ser Ile 
1285 1290 1295 

atg cag ggg aca cca aga gca aca act gaa agc ttt gaa gat ggc ctt 
Met Gln Gly Thr Pro Arg Ala Thr Thr Glu Ser Phe Glu Asp Gly Leu 
1300 1305 1310 

aaa tat ccc aaa caa att aaa agg gaa agt cct ccc ata cga gca ttt 
Lys Tyr Pro Lys Gln Ile Lys Arg Glu Ser Pro Pro Ile Arg Ala Phe 
1315 1320 1325 

gaa ggt gcc att acc aaa gga aaa cca tat gat ggc atc acc acc atc 
Glu Gly Ala Ile Thr Lys Gly Lys Pro Tyr Asp Gly Ile Thr Thr Ile 
1330 1335 1340 

aaa gaa atg ggg cgt tcc att cat gag att cca agg caa gat att tta 
Lys Glu Met Gly Arg Ser Ile His Glu Ile Pro Arg Gln Asp Ile Leu 
1345 1350 1355 1360 

act cag gaa agt cgg aaa act cca gaa gtg gtc cag agc aca cgg ccg 
Thr Gln Glu Ser Arg Lys Thr Pro Glu Val Val Gln Ser Thr Arg Pro 
1365 1370 1375 

ata att gag ggt tcc att tcc cag ggc aca cca ata aag ttt gac aac 
Ile Ile Glu Gly Ser Ile Ser Gln Gly Thr Pro Ile Lys Phe Asp Asn 
1380 1385 1390 

aac tca ggt caa tct gcc atc aaa cac aat gtc aaa tcc tta atc acg 
Asn Ser Gly Gln Ser Ala Ile Lys His Asn Val Lys Ser Leu Ile Thr 
1395 1400 1405 

ggg cct agc aaa cta tcc cgt gga atg cct ccg ctg gaa att gtg cca 
Gly Pro Ser Lys Leu Ser Arg Gly Met Pro Pro Leu Glu Ile Val Pro 
1410 1415 1420 

gag aac ata aaa gtg gta gaa cgg gga aaa tat gag gat gtg aaa gca 
Glu Asn Ile Lys Val Val Glu Arg Gly Lys Tyr Glu Asp Val Lys Ala 
1425 1430 1435 1440 

ggc gag acc gtg cgt tcc cgg cac acg tca gtg gta agc tct ggc ccc 
Gly Glu Thr Val Arg Ser Arg His Thr Ser Val Val Ser Ser Gly Pro 
1445 1450 1455 

tcc gtt ctt agg tcc aca ctg cat gaa gct ccc aaa gca caa ctg agc 
Ser Val Leu Arg Ser Thr Leu His Glu Ala Pro Lys Ala Gln Leu Ser 
1460 1465 1470 

cct ggg att tat gat gac acc agt gca cgg agg acc cct gtg agt tat 
Pro Gly Ile Tyr Asp Asp Thr Ser Ala Arg Arg Thr Pro Val Ser Tyr 
1475 1480 1485 

caa aac acc atg tcc aga ggc tca ccc atg atg aac aga act tct gat 
Gln Asn Thr Met Ser Arg Gly Ser Pro Met Met Asn Arg Thr Ser Asp 
1490 1495 1500 

gtt aca att cct cct aac aag tct acc aat cat gaa agg aaa tcg aca 
Val Thr Ile Pro Pro Asn Lys Ser Thr Asn His Glu Arg Lys Ser Thr 
1505 1510 1515 1520 

ctg acc cct acc cag agg gaa agt atc cca gcg aag tct cca gtg cct 
Leu Thr Pro Thr Gln Arg Glu Ser Ile Pro Ala Lys Ser Pro Val Pro 
1525 1530 1535 

ggg gtg gac cct gtc gtg agc cac agt ccg ttt gat ccc cat cac aga 
Gly Val Asp Pro Val Val Ser His Ser Pro Phe Asp Pro His His Arg 
1540 1545 1550 

ggc agc act gca ggc gag gtt tat tgg agc cac ctg ccc acg caa ttg 
Gly Ser Thr Ala Gly Glu Val Tyr Trp Ser His Leu Pro Thr Gln Leu 
1555 1560 1565 

gat cca gcc atg cct ttt cac agg gct ttg gat cct gca gcg gct gct 
Asp Pro Ala Met Pro Phe His Arg Ala Leu Asp Pro Ala Ala Ala Ala 
1570 1575 1580 

tac ctg ttt cag aga cag ctt tca cca act cca ggt tac cca agt cag 
Tyr Leu Phe Gln Arg Gln Leu Ser Pro Thr Pro Gly Tyr Pro Ser Gln 
1585 1590 1595 1600 

tat cag ctt tac gca atg gag aac aca aga cag aca atc tta aat gat 
Tyr Gln Leu Tyr Ala Met Glu Asn Thr Arg Gln Thr Ile Leu Asn Asp 
1605 1610 1615 

tac att acc tca caa cag atg caa gtg aac ttg cgt cca gat gtg gcc 
Tyr Ile Thr Ser Gln Gln Met Gln Val Asn Leu Arg Pro Asp Val Ala 
1620 1625 1630 

aga gga ctc tcc cca aga gag cag cca ctg ggt ctc cca tac cca gca 
Arg Gly Leu Ser Pro Arg Glu Gln Pro Leu Gly Leu Pro Tyr Pro Ala 
1635 1640 1645 

acg aga gga atc att gac ctg acc aat atg cct cca aca att tta gtg 
Thr Arg Gly Ile Ile Asp Leu Thr Asn Met Pro Pro Thr Ile Leu Val 
1650 1655 1660 

cct cat cca ggg gga aca agc act cct ccc atg gac aga atc act tat 
Pro His Pro Gly Gly Thr Ser Thr Pro Pro Met Asp Arg Ile Thr Tyr 
1665 1670 1675 1680 

att cct ggt aca cag att act ttc cct ccc agg ccg tac aac tct gct 
Ile Pro Gly Thr Gln Ile Thr Phe Pro Pro Arg Pro Tyr Asn Ser Ala 
1685 1690 1695 

tcc atg tct cca gga cac cca aca cac ctt gca gct gct gca agt gct 
Ser Met Ser Pro Gly His Pro Thr His Leu Ala Ala Ala Ala Ser Ala 
1700 1705 1710 

gag agg gaa cgg gaa cgg gag cgg gag aag gag cgg gag cgg gaa cgg 
Glu Arg Glu Arg Glu Arg Glu Arg Glu Lys Glu Arg Glu Arg Glu Arg 
1715 1720 1725 

att gct gca gct tcc tcc gac ctc tac ctg cgg cca ggc tca gaa cag 
Ile Ala Ala Ala Ser Ser Asp Leu Tyr Leu Arg Pro Gly Ser Glu Gln 
1730 1735 1740 

cct ggc cga cct ggc agt cat gga tat gtt cgc tcc cct tcc cct tca 
Pro Gly Arg Pro Gly Ser His Gly Tyr Val Arg Ser Pro Ser Pro Ser 
1745 1750 1755 1760 

gta aga act cag gag acc atg ttg caa cag aga ccc agt gtt ttc caa 
Val Arg Thr Gln Glu Thr Met Leu Gln Gln Arg Pro Ser Val Phe Gln 
1765 1770 1775 

gga acc aat gga acc agt gta atc aca cct ttg gat cca act gct cag 
Gly Thr Asn Gly Thr Ser Val Ile Thr Pro Leu Asp Pro Thr Ala Gln 
1780 1785 1790 

cta cga atc atg cca ctg cct gct ggg ggc cct tca ata agc caa ggc 
Leu Arg Ile Met Pro Leu Pro Ala Gly Gly Pro Ser Ile Ser Gln Gly 
1795 1800 1805 

ctg cca gcc tcc cgt tac aac act gct gcg gat gcc ctg gct gct ctt 
Leu Pro Ala Ser Arg Tyr Asn Thr Ala Ala Asp Ala Leu Ala Ala Leu 
1810 1815 1820 

gtg gat gct gca gct tct gca ccc cag atg gat gtg tcc aaa aca aaa 
Val Asp Ala Ala Ala Ser Ala Pro Gln Met Asp Val Ser Lys Thr Lys 
1825 1830 1835 1840 

gag agt aag cat gaa gct gcc agg tta gaa gaa aat ttg aga agc agg 
Glu Ser Lys His Glu Ala Ala Arg Leu Glu Glu Asn Leu Arg Ser Arg 
1845 1850 1855 

tca gca gca gtt agt gaa cag cag cag cta gag cag aaa acc ctg gag 
Ser Ala Ala Val Ser Glu Gln Gln Gln Leu Glu Gln Lys Thr Leu Glu 
1860 1865 1870 

gtg gag aag aga tct gtt cag tgt tta tac act tct tca gcc ttt cca 
Val Glu Lys Arg Ser Val Gln Cys Leu Tyr Thr Ser Ser Ala Phe Pro 
1875 1880 1885 

agt ggc aag ccc cag cct cat tct tca gta gtt tat tct gag gct ggg 
Ser Gly Lys Pro Gln Pro His Ser Ser Val Val Tyr Ser Glu Ala Gly 
1890 1895 1900 

aaa gat aaa ggg cct cct cca aaa tcc aga tat gag gaa gag cta agg 
Lys Asp Lys Gly Pro Pro Pro Lys Ser Arg Tyr Glu Glu Glu Leu Arg 
1905 1910 1915 1920 

acc aga ggg aag act acc att act gca gct aac ttc ata gac gtg atc 
Thr Arg Gly Lys Thr Thr Ile Thr Ala Ala Asn Phe Ile Asp Val Ile 
1925 1930 1935 

atc acc cgg caa att gcc tcg gac aag gat gcg agg gaa cgt ggc tct 
Ile Thr Arg Gln Ile Ala Ser Asp Lys Asp Ala Arg Glu Arg Gly Ser 
1940 1945 1950 

caa agt tca gac tct tct agt agc tta tct tct cac agg tat gaa aca 
Gln Ser Ser Asp Ser Ser Ser Ser Leu Ser Ser His Arg Tyr Glu Thr 
1955 1960 1965 

cct agc gat gct att gag gtg ata agt cct gcc agc tca cct gcg cca 
Pro Ser Asp Ala Ile Glu Val Ile Ser Pro Ala Ser Ser Pro Ala Pro 
1970 1975 1980 

ccc cag gag aaa ctg cag acc tat cag cca gag gtt gtt aag gca aat 
Pro Gln Glu Lys Leu Gln Thr Tyr Gln Pro Glu Val Val Lys Ala Asn 
1985 1990 1995 2000 

caa gcg gaa aat gat cct acc aga caa tat gaa gga cca tta cat cac 
Gln Ala Glu Asn Asp Pro Thr Arg Gln Tyr Glu Gly Pro Leu His His 
2005 2010 2015 

tat cga cca cag cag gaa tca cca tct ccc caa caa cag ctg ccc cct 
Tyr Arg Pro Gln Gln Glu Ser Pro Ser Pro Gln Gln Gln Leu Pro Pro 
2020 2025 2030 

tct tca cag gca gag gga atg ggg caa gtg ccc agg acc cat cgg ctg 
Ser Ser Gln Ala Glu Gly Met Gly Gln Val Pro Arg Thr His Arg Leu 
2035 2040 2045 

atc aca ctt gct gat cac atc tgt caa att atc aca caa gat ttt gct 
Ile Thr Leu Ala Asp His Ile Cys Gln Ile Ile Thr Gln Asp Phe Ala 
2050 2055 2060 

aga aat caa gtt tcc tcg cag act ccc cag cag cct cct act tct aca 
Arg Asn Gln Val Ser Ser Gln Thr Pro Gln Gln Pro Pro Thr Ser Thr 
2065 2070 2075 2080 

ttc cag aac tca cct tct gct ttg gta tct aca cct gtg agg act aaa 
Phe Gln Asn Ser Pro Ser Ala Leu Val Ser Thr Pro Val Arg Thr Lys 
2085 2090 2095 

aca tca aac cgt tac agc cca gaa tcc cag gct cag tct gtc cat cat 
Thr Ser Asn Arg Tyr Ser Pro Glu Ser Gln Ala Gln Ser Val His His 
2100 2105 2110 

caa aga cca ggt tca agg gtc tct cca gaa aat ctt gtg gac aaa tcc 
Gln Arg Pro Gly Ser Arg Val Ser Pro Glu Asn Leu Val Asp Lys Ser 
2115 2120 2125 

agg gga agt agg cct gga aaa tcc cca gag agg agt cac gtc tct tcc 
Arg Gly Ser Arg Pro Gly Lys Ser Pro Glu Arg Ser His Val Ser Ser 
2130 2135 2140 

gag ccc tac gag ccc atc tcc cca ccc cag gtt ccg gtt gtg cat gag 
Glu Pro Tyr Glu Pro Ile Ser Pro Pro Gln Val Pro Val Val His Glu 
2145 2150 2155 2160 

aaa cag gac agc ttg ctg ctc ttg tct cag agg ggc gca gag cct gca 
Lys Gln Asp Ser Leu Leu Leu Leu Ser Gln Arg Gly Ala Glu Pro Ala 
2165 2170 2175 

gag cag agg aat gat gcc cgc tca cca ggg agt ata agc tac ttg cct 
Glu Gln Arg Asn Asp Ala Arg Ser Pro Gly Ser Ile Ser Tyr Leu Pro 
2180 2185 2190 

tca ttc ttc acc aag ctt gaa aat aca tca ccc atg gtt aaa tca aag 
Ser Phe Phe Thr Lys Leu Glu Asn Thr Ser Pro Met Val Lys Ser Lys 
2195 2200 2205 

aag cag gag att ttt cgt aag ttg aac tcc tct ggt gga ggt gac tct 
Lys Gln Glu Ile Phe Arg Lys Leu Asn Ser Ser Gly Gly Gly Asp Ser 
2210 2215 2220 

gat atg gca gct gct cag cca gga act gag atc ttt aat ctg cca gca 
Asp Met Ala Ala Ala Gln Pro Gly Thr Glu Ile Phe Asn Leu Pro Ala 
2225 2230 2235 2240 

gtt act acg tca ggc tca gtt agc tct aga ggc cat tct ttt gct gat 
Val Thr Thr Ser Gly Ser Val Ser Ser Arg Gly His Ser Phe Ala Asp 
2245 2250 2255 

cct gcc agt aat ctt ggg ctg gaa gac att atc agg aag gct ctc atg 
Pro Ala Ser Asn Leu Gly Leu Glu Asp Ile Ile Arg Lys Ala Leu Met 
2260 2265 2270 

gga agc ttt gat gac aaa gtt gag gat cat gga gtt gtc atg tcc cag 
Gly Ser Phe Asp Asp Lys Val Glu Asp His Gly Val Val Met Ser Gln 
2275 2280 2285 

cct atg gga gta gtg cct ggt act gcc aac acc tca gtt gtg acc agt 
Pro Met Gly Val Val Pro Gly Thr Ala Asn Thr Ser Val Val Thr Ser 
2290 2295 2300 

ggt gag aca cga aga gag gaa ggg gac cca tca cct cat tca gga gga 7200 
Gly Glu Thr Arg Arg Glu Glu Gly Asp Pro Ser Pro His Ser Gly Gly 
2305 2310 2315 2320 

gtt tgc aaa cca aag ctg atc agc aag tca aac agc agg aaa tct aag 7248 
Val Cys Lys Pro Lys Leu Ile Ser Lys Ser Asn Ser Arg Lys Ser Lys 
2325 2330 2335 

tct cct ata cct ggg caa ggc tac tta gga acg gaa cgg ccc tct tca 7296 
Ser Pro Ile Pro Gly Gln Gly Tyr Leu Gly Thr Glu Arg Pro Ser Ser 
2340 2345 2350 

gtc tcc tct gta cat tca gaa ggg gat tac cat agg cag acg cca ggg 7344 
Val Ser Ser Val His Ser Glu Gly Asp Tyr His Arg Gln Thr Pro Gly 
2355 2360 2365 

tgg gcc tgg gaa gac agg ccc tct tca aca ggc tca act cag ttt cct 7392 
Trp Ala Trp Glu Asp Arg Pro Ser Ser Thr Gly Ser Thr Gln Phe Pro 
2370 2375 2380 

tat aac cct ctg act atg cgg atg ctc agc agt act cca cca aca ccg 7440 
Tyr Asn Pro Leu Thr Met Arg Met Leu Ser Ser Thr Pro Pro Thr Pro 
2385 2390 2395 2400 

att gca tgt gct ccc tct gcg gtg aac caa gca gct cct cac caa cag 7488 
Ile Ala Cys Ala Pro Ser Ala Val Asn Gln Ala Ala Pro His Gln Gln 
2405 2410 2415 

aac agg atc tgg gag cga gag cct gcc cca ctg ctc tca gca cag tac 7536 
Asn Arg Ile Trp Glu Arg Glu Pro Ala Pro Leu Leu Ser Ala Gln Tyr 
2420 2425 2430 

gag acc ctg tcg gat agt gat gac tga actgcacaaa gtgaggggaa 7583 
Glu Thr Leu Ser Asp Ser Asp Asp * 
2435 2440 

cagggtgcag gagagggatc tctagttttt gtggtttaat ttttagtagc aggtcaaaaa 7643 

cctgccctcc tgtgacttat tceetgagac ttttcaggag agecagecca cagatgatga 7703 

agaaatgatg gaagttcatt tggagagtea aatgggaaaa aaaeaaacaa aaaactgcct 77 63 

ttgataeagg eaatteagtg gactataata atagtggagg gttgagatgt agagttttta 7823 

aaaagtgaae agttgctgtt cttacatctg taaagaaaae eataatgtet ttaaatcact 7 8 83 

cttctgtaaa tagatgacet ttttgeagtg taaaaaaaaa aaaaaaaaaa aaaaaaa 794 0 



<210> 11 
<211> 2440 
<212> PRT 

<213> Homo sapiens 



<400> 11 

Met Ser Ser Ser Gly Tyr Pro Pro Asn Gln Gly Ala Phe Ser Thr Glu 
1 5 10 15 

Gln Ser Arg Tyr Pro Pro His Ser Val Gln Tyr Thr Phe Pro Asn Thr 
20 25 30 

Arg His Gln Gln Glu Phe Ala Val Pro Asp Tyr Arg Ser Ser His Leu 
35 40 45 

Glu Val Ser Gln Ala Ser Gln Leu Leu Gln Gln Gln Gln Gln Gln Gln 
50 55 60 

Leu Arg Arg Arg Pro Ser Leu Leu Ser Glu Phe His Pro Gly Ser Asp 
65 70 75 80 

Arg Pro Gln Glu Arg Arg Thr Ser Tyr Glu Pro Phe His Pro Gly Pro 
85 90 95 

Ser Pro Val Asp His Asp Ser Leu Glu Ser Lys Arg Pro Arg Leu Glu 
100 105 110 

Gln Val Ser Asp Ser His Phe Gln Arg Val Ser Ala Ala Val Leu Pro 
115 120 125 

Leu Val His Pro Leu Pro Glu Gly Leu Arg Ala Ser Ala Asp Ala Lys 
130 135 140 

Lys Asp Pro Ala Phe Gly Gly Lys His Glu Ala Pro Ser Ser Pro Ile 
145 150 155 160 

Ser Gly Gln Pro Cys Gly Asp Asp Gln Asn Ala Ser Pro Ser Lys Leu 
165 170 175 

Ser Lys Glu Glu Leu Ile Gln Ser Met Asp Arg Val Asp Arg Glu Ile 
180 185 190 

Ala Lys Val Glu Gln Gln Ile Leu Lys Leu Lys Lys Lys Gln Gln Gln 
195 200 205 

Leu Glu Glu Glu Ala Ala Lys Pro Pro Glu Pro Glu Lys Pro Val Ser 
210 215 220 

Pro Pro Pro Val Glu Gln Lys His Arg Ser Ile Val Gln Ile Ile Tyr 
225 230 235 240 

Asp Glu Asn Arg Lys Lys Ala Glu Glu Ala His Lys Ile Phe Glu Gly 
245 250 255 

Leu Gly Pro Lys Val Glu Leu Pro Leu Tyr Asn Gln Pro Ser Asp Thr 
260 265 270 

Lys Val Tyr His Glu Asn Ile Lys Thr Asn Gln Val Met Arg Lys Lys 
275 280 285 

Leu Ile Leu Phe Phe Lys Arg Arg Asn His Ala Arg Lys Gln Arg Glu 
290 295 300 

Gln Lys Ile Cys Gln Arg Tyr Asp Gln Leu Met Glu Ala Trp Glu Lys 
305 310 315 320 

Lys Val Asp Arg Ile Glu Asn Asn Pro Arg Arg Lys Ala Lys Glu Ser 
325 330 335 

Lys Thr Arg Glu Tyr Tyr Glu Lys Gln Phe Pro Glu Ile Arg Lys Gln 
340 345 350 

Arg Glu Gln Gln Glu Arg Phe Gln Arg Val Gly Gln Arg Gly Ala Gly 
355 360 365 

Leu Ser Ala Thr Ile Ala Arg Ser Glu His Glu Ile Ser Glu Ile Ile 
370 375 380 

Asp Gly Leu Ser Glu Gln Glu Asn Asn Glu Lys Gln Met Arg Gln Leu 
385 390 395 400 

Ser Val Ile Pro Pro Met Met Phe Asp Ala Glu Gln Arg Arg Val Lys 
405 410 415 

Phe Ile Asn Met Asn Gly Leu Met Glu Asp Pro Met Lys Val Tyr Lys 
420 425 430 

Asp Arg Gln Phe Met Asn Val Trp Thr Asp His Glu Lys Glu Ile Phe 
435 440 445 

Lys Asp Lys Phe Ile Gln His Pro Lys Asn Phe Gly Leu Ile Ala Ser 
450 455 460 

Tyr Leu Glu Arg Lys Ser Val Pro Asp Cys Val Leu Tyr Tyr Tyr Leu 
465 470 475 480 

Thr Lys Lys Asn Glu Asn Tyr Lys Ala Leu Val Arg Arg Asn Tyr Gly 
485 490 495 

Lys Arg Arg Gly Arg Asn Gln Gln Ile Ala Arg Pro Ser Gln Glu Glu 
500 505 510 

Lys Val Glu Glu Lys Glu Glu Asp Lys Ala Glu Lys Thr Glu Lys Lys 
515 520 525 

Glu Glu Glu Lys Lys Asp Glu Glu Glu Lys Asp Glu Lys Glu Asp Ser 
530 535 540 

Lys Glu Asn Thr Lys Glu Lys Asp Lys Ile Asp Gly Thr Ala Glu Glu 
545 550 555 560 

Thr Glu Glu Arg Glu Gln Ala Thr Pro Arg Gly Arg Lys Thr Ala Asn 
565 570 575 

Ser Gln Gly Arg Arg Lys Gly Arg Ile Thr Arg Ser Met Thr Asn Glu 
580 585 590 

Ala Ala Ala Ala Ser Ala Ala Ala Ala Ala Ala Thr Glu Glu Pro Pro 
595 600 605 

Pro Pro Leu Pro Pro Pro Pro Glu Pro Ile Ser Thr Glu Pro Val Glu 
610 615 620 

Thr Ser Arg Trp Thr Glu Glu Glu Met Glu Val Ala Lys Lys Gly Leu 
625 630 635 640 

Val Glu His Gly Arg Asn Trp Ala Ala Ile Ala Lys Met Val Gly Thr 
645 650 655 

Lys Ser Glu Ala Gln Cys Lys Asn Phe Tyr Phe Asn Tyr Lys Arg Arg 
660 665 670 

His Asn Leu Asp Asn Leu Leu Gln Gln His Lys Gln Lys Thr Ser Arg 
675 680 685 

Lys Pro Arg Glu Glu Arg Asp Val Ser Gln Cys Glu Ser Val Ala Ser 
690 695 700 

Thr Val Ser Ala Gln Glu Asp Glu Asp Ile Glu Ala Ser Asn Glu Glu 
705 710 715 720 

Glu Asn Pro Glu Asp Ser Glu Val Glu Ala Val Lys Pro Ser Glu Asp 
725 730 735 

Ser Pro Glu Asn Ala Thr Ser Arg Gly Asn Thr Glu Pro Ala Val Glu 
740 745 750 

Leu Glu Pro Thr Thr Glu Thr Ala Pro Ser Thr Ser Pro Ser Leu Ala 
755 760 765 

Val Pro Ser Thr Lys Pro Ala Glu Asp Glu Ser Val Glu Thr Gln Val 
770 775 780 

Asn Asp Ser Ile Ser Ala Glu Thr Ala Glu Gln Met Asp Val Asp Gln 
785 790 795 800 

Gln Glu His Ser Ala Glu Glu Gly Ser Val Cys Asp Pro Pro Pro Ala 
805 810 815 

Thr Lys Ala Asp Ser Val Asp Val Glu Val Arg Val Pro Glu Asn His 
820 825 830 

Ala Ser Lys Val Glu Gly Asp Asn Thr Lys Glu Arg Asp Leu Asp Arg 
835 840 845 

Ala Ser Glu Lys Val Glu Pro Arg Asp Glu Asp Leu Val Val Ala Gln 
850 855 860 

Gln Ile Asn Ala Gln Arg Pro Glu Pro Gln Ser Asp Asn Asp Ser Ser 
865 870 875 880 

Ala Thr Cys Ser Ala Asp Glu Asp Val Asp Gly Glu Pro Glu Arg Gln 
885 890 895 

Arg Met Phe Pro Met Asp Ser Lys Pro Ser Leu Leu Asn Pro Thr Gly 
900 905 910 

Ser Ile Leu Val Ser Ser Pro Leu Lys Pro Asn Pro Leu Asp Leu Pro 
915 920 925 

Gln Leu Gln His Arg Ala Ala Val Ile Pro Pro Met Val Ser Cys Thr 
930 935 940 

Pro Cys Asn Ile Pro Ile Gly Thr Pro Val Ser Gly Tyr Ala Leu Tyr 
945 950 955 960 

Gln Arg His Ile Lys Ala Met His Glu Ser Ala Leu Leu Glu Glu Gln 
965 970 975 

Arg Gln Arg Gln Glu Gln Ile Asp Leu Glu Cys Arg Ser Ser Thr Ser 
980 985 990 

Pro Cys Gly Thr Ser Lys Ser Pro Asn Arg Glu Trp Glu Val Leu Gln 
995 1000 1005 

Pro Ala Pro His Gln Leu Ile Thr Asn Leu Pro Glu Gly Val Arg Leu 
1010 1015 1020 

Pro Thr Thr Arg Pro Thr Arg Pro Pro Pro Pro Leu Ile Pro Ser Ser 
1025 1030 1035 1040 

Lys Thr Thr Val Ala Ser Glu Lys Pro Ser Phe Ile Met Gly Gly Ser 
1045 1050 1055 

Ile Ser Gln Gly Thr Pro Gly Thr Tyr Leu Thr Ser His Asn Gln Ala 
1060 1065 1070 

Ser Tyr Thr Gln Glu Thr Pro Lys Pro Ser Val Gly Ser Ile Ser Leu 
1075 1080 1085 

Gly Leu Pro Arg Gln Gln Glu Ser Ala Lys Ser Ala Thr Leu Pro Tyr 
1090 1095 1100 

Ile Lys Gln Glu Glu Phe Ser Pro Arg Ser Gln Asn Ser Gln Pro Glu 
1105 1110 1115 1120 

Gly Leu Leu Val Arg Ala Gln His Glu Gly Val Val Arg Gly Thr Ala 
1125 1130 1135 

Gly Ala Ile Gln Glu Gly Ser Ile Thr Arg Gly Thr Pro Thr Ser Lys 
1140 1145 1150 

Ile Ser Val Glu Ser Ile Pro Ser Leu Arg Gly Ser Ile Thr Gln Gly 
1155 1160 1165 

Thr Pro Ala Leu Pro Gln Thr Gly Ile Pro Thr Glu Ala Leu Val Lys 
1170 1175 1180 

Gly Ser Ile Ser Arg Met Pro Ile Glu Asp Ser Ser Pro Glu Lys Gly 
1185 1190 1195 1200 

Arg Glu Glu Ala Ala Ser Lys Gly His Val Ile Tyr Glu Gly Lys Ser 
1205 1210 1215 

Gly His Ile Leu Ser Tyr Asp Asn Ile Lys Asn Ala Arg Glu Gly Thr 
1220 1225 1230 

Arg Ser Pro Arg Thr Ala His Glu Ile Ser Leu Lys Arg Ser Tyr Glu 
1235 1240 1245 

Ser Val Glu Gly Asn Ile Lys Gln Gly Met Ser Met Arg Glu Ser Pro 
1250 1255 1260 

Val Ser Ala Pro Leu Glu Gly Leu Ile Cys Arg Ala Leu Pro Arg Gly 
1265 1270 1275 1280 

Ser Pro His Ser Asp Leu Lys Glu Arg Thr Val Leu Ser Gly Ser Ile 
1285 1290 1295 

Met Gln Gly Thr Pro Arg Ala Thr Thr Glu Ser Phe Glu Asp Gly Leu 
1300 1305 1310 

Lys Tyr Pro Lys Gln Ile Lys Arg Glu Ser Pro Pro Ile Arg Ala Phe 
1315 1320 1325 

Glu Gly Ala Ile Thr Lys Gly Lys Pro Tyr Asp Gly Ile Thr Thr Ile 
1330 1335 1340 

Lys Glu Met Gly Arg Ser Ile His Glu Ile Pro Arg Gln Asp Ile Leu 
1345 1350 1355 1360 

Thr Gln Glu Ser Arg Lys Thr Pro Glu Val Val Gln Ser Thr Arg Pro 
1365 1370 1375 

Ile Ile Glu Gly Ser Ile Ser Gln Gly Thr Pro Ile Lys Phe Asp Asn 
1380 1385 1390 

Asn Ser Gly Gln Ser Ala Ile Lys His Asn Val Lys Ser Leu Ile Thr 
1395 1400 1405 

Gly Pro Ser Lys Leu Ser Arg Gly Met Pro Pro Leu Glu Ile Val Pro 
1410 1415 1420 

Glu Asn Ile Lys Val Val Glu Arg Gly Lys Tyr Glu Asp Val Lys Ala 
1425 1430 1435 1440 

Gly Glu Thr Val Arg Ser Arg His Thr Ser Val Val Ser Ser Gly Pro 
1445 1450 1455 

Ser Val Leu Arg Ser Thr Leu His Glu Ala Pro Lys Ala Gln Leu Ser 
1460 1465 1470 

Pro Gly Ile Tyr Asp Asp Thr Ser Ala Arg Arg Thr Pro Val Ser Tyr 
1475 1480 1485 

Gln Asn Thr Met Ser Arg Gly Ser Pro Met Met Asn Arg Thr Ser Asp 
1490 1495 1500 

Val Thr Ile Pro Pro Asn Lys Ser Thr Asn His Glu Arg Lys Ser Thr 
1505 1510 1515 1520 

Leu Thr Pro Thr Gln Arg Glu Ser Ile Pro Ala Lys Ser Pro Val Pro 
1525 1530 1535 

Gly Val Asp Pro Val Val Ser His Ser Pro Phe Asp Pro His His Arg 
1540 1545 1550 

Gly Ser Thr Ala Gly Glu Val Tyr Trp Ser His Leu Pro Thr Gln Leu 
1555 1560 1565 

Asp Pro Ala Met Pro Phe His Arg Ala Leu Asp Pro Ala Ala Ala Ala 
1570 1575 1580 

Tyr Leu Phe Gln Arg Gln Leu Ser Pro Thr Pro Gly Tyr Pro Ser Gln 
1585 1590 1595 1600 

Tyr Gln Leu Tyr Ala Met Glu Asn Thr Arg Gln Thr Ile Leu Asn Asp 
1605 1610 1615 

Tyr Ile Thr Ser Gln Gln Met Gln Val Asn Leu Arg Pro Asp Val Ala 
1620 1625 1630 

Arg Gly Leu Ser Pro Arg Glu Gln Pro Leu Gly Leu Pro Tyr Pro Ala 
1635 1640 1645 

Thr Arg Gly Ile Ile Asp Leu Thr Asn Met Pro Pro Thr Ile Leu Val 
1650 1655 1660 

Pro His Pro Gly Gly Thr Ser Thr Pro Pro Met Asp Arg Ile Thr Tyr 
1665 1670 1675 1680 

Ile Pro Gly Thr Gln Ile Thr Phe Pro Pro Arg Pro Tyr Asn Ser Ala 
1685 1690 1695 

Ser Met Ser Pro Gly His Pro Thr His Leu Ala Ala Ala Ala Ser Ala 
1700 1705 1710 

Glu Arg Glu Arg Glu Arg Glu Arg Glu Lys Glu Arg Glu Arg Glu Arg 
1715 1720 1725 

Ile Ala Ala Ala Ser Ser Asp Leu Tyr Leu Arg Pro Gly Ser Glu Gln 
1730 1735 1740 

Pro Gly Arg Pro Gly Ser His Gly Tyr Val Arg Ser Pro Ser Pro Ser 
1745 1750 1755 1760 

Val Arg Thr Gln Glu Thr Met Leu Gln Gln Arg Pro Ser Val Phe Gln 
1765 1770 1775 

Gly Thr Asn Gly Thr Ser Val Ile Thr Pro Leu Asp Pro Thr Ala Gln 
1780 1785 1790 

Leu Arg Ile Met Pro Leu Pro Ala Gly Gly Pro Ser Ile Ser Gln Gly 
1795 1800 1805 

Leu Pro Ala Ser Arg Tyr Asn Thr Ala Ala Asp Ala Leu Ala Ala Leu 
1810 1815 1820 

Val Asp Ala Ala Ala Ser Ala Pro Gln Met Asp Val Ser Lys Thr Lys 
1825 1830 1835 1840 

Glu Ser Lys His Glu Ala Ala Arg Leu Glu Glu Asn Leu Arg Ser Arg 
1845 1850 1855 

Ser Ala Ala Val Ser Glu Gln Gln Gln Leu Glu Gln Lys Thr Leu Glu 
1860 1865 1870 

Val Glu Lys Arg Ser Val Gln Cys Leu Tyr Thr Ser Ser Ala Phe Pro 
1875 1880 1885 

Ser Gly Lys Pro Gln Pro His Ser Ser Val Val Tyr Ser Glu Ala Gly 
1890 1895 1900 

Lys Asp Lys Gly Pro Pro Pro Lys Ser Arg Tyr Glu Glu Glu Leu Arg 
1905 1910 1915 1920 

Thr Arg Gly Lys Thr Thr Ile Thr Ala Ala Asn Phe Ile Asp Val Ile 
1925 1930 1935 

Ile Thr Arg Gln Ile Ala Ser Asp Lys Asp Ala Arg Glu Arg Gly Ser 
1940 1945 1950 

Gln Ser Ser Asp Ser Ser Ser Ser Leu Ser Ser His Arg Tyr Glu Thr 
1955 1960 1965 

Pro Ser Asp Ala Ile Glu Val Ile Ser Pro Ala Ser Ser Pro Ala Pro 
1970 1975 1980 

Pro Gln Glu Lys Leu Gln Thr Tyr Gln Pro Glu Val Val Lys Ala Asn 
1985 1990 1995 2000 

Gln Ala Glu Asn Asp Pro Thr Arg Gln Tyr Glu Gly Pro Leu His His 
2005 2010 2015 

Tyr Arg Pro Gln Gln Glu Ser Pro Ser Pro Gln Gln Gln Leu Pro Pro 
2020 2025 2030 

Ser Ser Gln Ala Glu Gly Met Gly Gln Val Pro Arg Thr His Arg Leu 
2035 2040 2045 

Ile Thr Leu Ala Asp His Ile Cys Gln Ile Ile Thr Gln Asp Phe Ala 
2050 2055 2060 

Arg Asn Gln Val Ser Ser Gln Thr Pro Gln Gln Pro Pro Thr Ser Thr 
2065 2070 2075 2080 

Phe Gln Asn Ser Pro Ser Ala Leu Val Ser Thr Pro Val Arg Thr Lys 
2085 2090 2095 

Thr Ser Asn Arg Tyr Ser Pro Glu Ser Gln Ala Gln Ser Val His His 
2100 2105 2110 

Gln Arg Pro Gly Ser Arg Val Ser Pro Glu Asn Leu Val Asp Lys Ser 
2115 2120 2125 

Arg Gly Ser Arg Pro Gly Lys Ser Pro Glu Arg Ser His Val Ser Ser 
2130 2135 2140 

Glu Pro Tyr Glu Pro Ile Ser Pro Pro Gln Val Pro Val Val His Glu 
2145 2150 2155 2160 

Lys Gln Asp Ser Leu Leu Leu Leu Ser Gln Arg Gly Ala Glu Pro Ala 
2165 2170 2175 

Glu Gln Arg Asn Asp Ala Arg Ser Pro Gly Ser Ile Ser Tyr Leu Pro 
2180 2185 2190 

Ser Phe Phe Thr Lys Leu Glu Asn Thr Ser Pro Met Val Lys Ser Lys 
2195 2200 2205 

Lys Gln Glu Ile Phe Arg Lys Leu Asn Ser Ser Gly Gly Gly Asp Ser 
2210 2215 2220 

Asp Met Ala Ala Ala Gln Pro Gly Thr Glu Ile Phe Asn Leu Pro Ala 
2225 2230 2235 2240 

Val Thr Thr Ser Gly Ser Val Ser Ser Arg Gly His Ser Phe Ala Asp 
2245 2250 2255 

Pro Ala Ser Asn Leu Gly Leu Glu Asp Ile Ile Arg Lys Ala Leu Met 
2260 2265 2270 

Gly Ser Phe Asp Asp Lys Val Glu Asp His Gly Val Val Met Ser Gln 
2275 2280 2285 

Pro Met Gly Val Val Pro Gly Thr Ala Asn Thr Ser Val Val Thr Ser 
2290 2295 2300 

Gly Glu Thr Arg Arg Glu Glu Gly Asp Pro Ser Pro His Ser Gly Gly 
2305 2310 2315 2320 

Val Cys Lys Pro Lys Leu Ile Ser Lys Ser Asn Ser Arg Lys Ser Lys 
2325 2330 2335 

Ser Pro Ile Pro Gly Gln Gly Tyr Leu Gly Thr Glu Arg Pro Ser Ser 
2340 2345 2350 

Val Ser Ser Val His Ser Glu Gly Asp Tyr His Arg Gln Thr Pro Gly 
2355 2360 2365 

Trp Ala Trp Glu Asp Arg Pro Ser Ser Thr Gly Ser Thr Gln Phe Pro 
2370 2375 2380 

Tyr Asn Pro Leu Thr Met Arg Met Leu Ser Ser Thr Pro Pro Thr Pro 
2385 2390 2395 2400 

Ile Ala Cys Ala Pro Ser Ala Val Asn Gln Ala Ala Pro His Gln Gln 
2405 2410 2415 

Asn Arg Ile Trp Glu Arg Glu Pro Ala Pro Leu Leu Ser Ala Gln Tyr 
2420 2425 2430 

Glu Thr Leu Ser Asp Ser Asp Asp 
2435 2440 



